EDITORIAL


Einstein v. Roberts 


In the recent U.S. Supreme Court hearing on A.
Fisher v. the University of Texas about university 
admission policies regarding minority students, 
Chief Justice John Roberts asked, “What unique 
perspective does a minority student bring to a 
physics class?” As an African-American physicist 
researching string theory, and a teacher of univer- 
sity students since 1972, I have a response. 

Issues related to race in the United States have created 
barriers since the nation’s founding, determining which 
citizens experience benefits, and which deprivations. This 
problem is not new for physicists. Albert Einstein’s essay 
“The Negro Question” includes “What...can the man of 
good will do to combat this 
deeply rooted prejudice? 
He must have the courage 
to set an example by word 
and deed, and must watch 
lest his children become in- 
fluenced by this racial bias.” 
Einstein described racism as 
a “disease,” and he recom-
mended principles to end
discrimination, aligning with 
the O. Brown v. the Board of 
Education of Topeka, Kan- 
sas, decision by the Supreme 
Court in 1954 to desegregate 
public schools. 

Chief Justice Roberts’ 
question—premised on the 
idea that a person’s back- 
ground, including race, is ir- 
relevant in science—shows 
a fundamental misunder- 
standing of both science and 
human creativity. Science is a creative process, which is 
why the enterprise needs diverse thinking. The Chief 
Justice’s physics class may have consisted of plugging 
numbers into equations to find how long a dropped ball 
takes to reach the ground. But today’s science classes 
often strive for creative exploration and collaboration 
to foster innovation. This played out recently in my 
upper-level undergraduate class. Students worked in 
small groups to solve a problem involving vector cal- 
culus and group theory—mathematics related to the 
discovery of subatomic particles such as the Higgs bo- 
son. In one group, two European-American students led 
discussions into a mathematical dead end. An African- 
American group member eventually wrote something 
on the board, which, when finally noticed, unstuck the
group, and the problem was solved. Days afterward, the
situation recurred, but this time, the group paid atten- 
tion to the minority student, asking, “How did you come 
to that answer?” The students learned more than vec- 
tor calculus that day. The majority students understood 
that a different perspective is an asset in science, while 
the minority student gained peer credibility and con-
fidence. Together, the members became more efficient as
problem solvers. 

This is only an anecdote, but it shows what can 
happen in real-world classrooms. Several books have 
described the efficacy of diversity for the sake of innova- 
tion. For example, Scott E. Page explains that diversity’s 
superiority emerges when a 
problem is difficult; that is, 
when no single individual 
always finds a solution,
particularly in situations re- 
quiring creativity. 

In 1969, I entered the 
Massachusetts Institute of 
Technology (MIT) expecting 
to be different from most of 
the other new undergradu-
ate students. Although it was of-
ten challenging, I found
that my difference could be 
an advantage: Distinctive 
backgrounds can lead to dif- 
ferent approaches to fram- 
ing problems. If MIT had 
been legally bound then to 
admissions based solely on 
test scores, I would never 
have been admitted. It 
would have been a personal 
loss, but more importantly, unique mathematical and 
physics ideas created in my career, and tied to my idio- 
syncratic framing of problems, might never have seen 
the light of day. 

Minorities have made progress in science, as my 
own life attests. People of color and women are the 
fastest-growing segments of the U.S. college population. 
But discrimination continues. Each individual brings 
unique experiences that influence the capacity to move 
science forward in creative ways. Colleges need ways to 
recognize this in admissions processes. In the mean- 
time, the Fisher v. Texas decision is poised to shape how 
and whether people like me can emerge in future sci- 
ence at its highest levels. 

-S. J. Gates, Jr. 
“I’m terribly sorry about all of this, @NERCscience.”


@JamesHand, a former BBC presenter, whose joking suggestion, 
Boaty McBoatface, now dominates an online contest to name the U.K. Natural 
Environment Research Council’s new £200 million ship. The winner will be 
chosen from the top nominees by a committee. 


IN BRIEF AROUND THE WORLD 


Lassa fever case in Germany


FRANKFURT, GERMANY | A German morti-
cian is being treated for Lassa fever at a
hospital in Frankfurt after handling the
body of a U.S. Agency for International
Development worker who had apparently
contracted the virus in Togo. It appears to
be the first clear case of human-to-human
transmission of the deadly virus outside
of West Africa, where Lassa is endemic.
(In 2000, a physician who treated a Lassa
patient in Germany developed antibodies
against the virus but did not fall ill.) The
World Health Organization announced an
outbreak of Lassa fever in Benin last month,
and neighboring Nigeria and Togo have
also reported cases. Nigeria alone has seen
close to 400 cases and at least 63 deaths
this year, according to the Nigeria Centre
for Disease Control. Another U.S. citizen
who worked in Togo is being treated for
Lassa fever at Emory Hospital in Atlanta.


SeaWorld says it will stop breeding orcas


SeaWorld announced last week that it will end orca breeding at
its marine parks and will phase out killer whale shows, focusing
instead on programs that emphasize “orca enrichment, exercise
and overall health.” The move comes after years of pressure by
animal rights and animal welfare advocates, as well as some
scientists, who have argued that cetaceans (a group including
whales and dolphins) shouldn’t be kept in captivity. Other researchers,
however, express concern that removing all cetaceans from captivity
could curtail important research that cannot be replicated in the wild.
(There are currently about 600 cetaceans kept in 34 facilities in North
America.) Studies of SeaWorld’s orcas have explored killer whale
breeding and physiology, but how much research the park’s animals
have been part of is unclear. As for its dolphins, SeaWorld has made no
announcement. “I think the move is a good decision for killer whales,
who can travel up to 160 kilometers in a day,” says animal behaviorist
Richard Connor of the University of Massachusetts, Dartmouth. “But
dolphins are more amenable to captivity, and we’ve just scratched the
surface of what we can learn from them.” http://scim.ag/_SeaWorld

Killer whale Tilikum, subject of the 2013 documentary Blackfish, watches his trainers at SeaWorld.


China sets lab animal guidelines

Rhesus macaques in a Beijing lab.

BEIJING | China has released its first
national standards governing the treat-
ment of laboratory animals. The draft
guidelines, which cover topics including
euthanasia, pain management, transport,
and housing, were posted last week for
public comment and may be implemented
by the end of this year. Scientists hope the
guidelines will not only improve condi- 
tions for animals, but also improve China’s 
prospects for international research col- 
laborations. Meanwhile, there is growing 
domestic concern about the mistreatment 
of lab animals because of recently docu- 
mented incidents of abuse. According to 
the Chinese Academy of Medical Sciences’s 
Institute of Laboratory Animal Sciences 
in Beijing, China uses roughly 20 million 
animals—mostly mice, but also large 
numbers of dogs, rabbits, and nonhuman 
primates—each year in research. 
http://scim.ag/Chinalabanimal 


Allen invests big in bioscience 


WASHINGTON, D.C. | Philanthropist Paul
G. Allen announced this week the creation 
of a new bioscience research initiative, 
which he is launching with $100 mil-
lion over the next 10 years. To determine
which investigators would receive the 
first grants from the new Paul G. Allen 
Frontiers Group, “we asked everyone the 
same question: What is the dark matter 
of bioscience?” says the group’s executive 
director, biomedical engineer Tom Skalak 
in Seattle, Washington. Four researchers— 
Jennifer Doudna of the University of 
California (UC), Berkeley, Ethan Bier
of UC San Diego, James Collins of the
Massachusetts Institute of Technology in 
Cambridge, and Bassem Hassan of the 
Brain and Spine Institute in Paris—will 
receive $1.5 million from the Frontiers 
Group to study topics including novel
techniques for gene editing, how shapes
and forms arise over the course of evolu- 
tion, and how synthetic biology can create 
microbes that can trap and kill danger- 
ous bacteria. Allen will also fund two new 
research centers at Stanford University in 
Palo Alto, California, and Tufts University 
in Boston. 


NSF funding rules may ease 


WASHINGTON, D.C. | U.S. Representative
John Culberson (R-TX) says that he no 
longer favors specifying funding levels 
for individual research directorates at 
the National Science Foundation (NSF). 
If Culberson’s change of heart is real, it 
would be a significant victory for the U.S. 
research community, which has accused 
him and other congressional Republicans
of placing their own research pri-
orities above those set by the agency
and working scientists. Culberson, who
chairs the spending subcommittee in the 
House of Representatives that controls 
NSF's budget, pushed NSF last year to 
boost spending on four of its six research
directorates—biology, computing, engi-
neering, and mathematics and physical
sciences—while trimming its investments
in the social and geosciences. He noted his
second thoughts last week after a hearing
on NSF’s 2017 budget request, in which
NSF Director France Córdova made a plea
for creating a portfolio based on the
most exciting research across all fields.
http://scim.ag/NSFCulberson


Hubble unveils monster stars


The star cluster R136 is already home to the largest known star in the universe,
a giant more than 250 times the mass of the sun. Now, astronomers using the
Hubble Space Telescope to observe the cluster in ultraviolet light have found a
total of nine stars with masses of more than 100 suns. The pack of heavyweights
is located in the Tarantula Nebula (shown above, with R136 below center), some
170,000 light-years from Earth. How they form is a mystery; the current theory of star
formation cannot explain how such behemoths could come together from the collapse
of a cloud of gas and dust. One possible explanation was that giant stars could grow
through the merger of binary star pairs, but that still wouldn't explain this number of
giants in close proximity, the team wrote last week in the Monthly Notices of the Royal
Astronomical Society. They plan to continue observing R136 with Hubble in visible
light, searching for binaries that could merge to produce such massive stars.


Tut’s tomb may hide secret chambers


King Tutankhamun's famously rich tomb has been painstakingly scrutinized since
its discovery by Western archaeologists in 1922. But it may still hide unexplored
treasures—possibly even the tomb of the legendary Egyptian queen Nefertiti.
On 17 March, Egypt's Antiquities Minister Mamdouh Eldamaty announced new
finds from the tomb at a press conference held in Cairo. In November 2015, high-
resolution radar scans of the tomb had suggested that additional chambers lay behind
the walls of the pharaoh’s burial chamber. Now, more detailed analyses of the data
confirm the existence of two hidden chambers, with objects inside apparently made
of metal and organic matter, Eldamaty said. No one knows yet what these objects are,
but certain artifacts in King Tut's tomb—which some scholars suggest appear to have
been made for someone else—offer the tantalizing prospect that the hidden chambers
could contain Nefertiti’s remains. Researchers will conduct more scans before making
any plans to excavate the chambers.

Radar scans of the walls of Tutankhamun’s tomb suggest concealed rooms.


Collections support suspended
ARLINGTON, VIRGINIA | The U.S. National
Science Foundation (NSF) announced last
week that it has indefinitely suspended
its Collections in Support of Biological
Research (CSBR) program, one of the only
public funding sources for basic museum 
infrastructure in the country. CSBR 
funded projects to maintain biological 
collections, such as upgrading freezers 
for tissue samples and building cabinets 
for plant collections. NSF will continue
to fund existing grants, but won’t accept
new applications in 2016 while it assesses 
the future of the program. The move has 
sparked concern among members of the 
museum community, who worry that NSF 
is funding specific research projects while 
failing to support the infrastructure they 
rely on. Biological collections support
a wide range of research projects, from
describing new species to assessing the 
impacts of climate change on ecosystems. 


Backup plan for unfunded grants 
BETHESDA, MARYLAND | Researchers 
whose grant proposals are rejected by the 
U.S. National Institutes of Health (NIH) 
may soon have a new way to find support 
for their research. The Online Partnership
to Accelerate Research (OnPAR), launched
in pilot form this month, aims to play
matchmaker between rejected NIH
projects and second-chance funders,
such as nonprofit disease research foun-
dations or pharmaceutical companies.
A collaboration between NIH and the
defense, engineering, and health contrac- 
tor Leidos, headquartered in Reston, 
Virginia, OnPAR lets researchers upload 
unfunded NIH proposals to an online 
portal where potential funders can review 
their scores and decide whether to put up 
cash. A similar project run by the National 
Health Council folded in 2012 after failing 
to secure any commitments from poten- 
tial funders. The year-long OnPAR pilot, 
described in an editorial this week in 
Science Translational Medicine, has signed 
up seven potential funders, all nonprofit
disease foundations, but its creators hope
it will expand to include biotech compa- 
nies and venture capital firms. 
http://scim.ag/Allengrants 
BY THE NUMBERS


4% 


Share of grants funded by the 
National Science Foundation last 
year whose titles were changed to 
make them clearer to the public. 
http://scim.ag/NSFtitles 


The average life span reduction, in 
years, for people with autism spectrum 
disorder compared with the general 
population, according to a report 
by the philanthropic group Autistica. 
http://scim.ag/Autisticarep 


95% 


Fraction of some 1976 wild relatives 
of 81 crops that aren't adequately 
conserved in world gene banks, although 
they may have valuable traits, such 
as drought and pest resistance (Nature 
Plants). http://scim.ag/wildcrops 


NEWSMAKERS 
Simon to lead cancer moonshot 


Greg Simon, a leukemia survivor, corpo- 
rate executive, and veteran Washington, 
D.C., insider, has been tapped to head Vice 
President Joe Biden’s $1 billion “moonshot” 
to cure cancer. After working as a con- 
gressional aide and adviser to then-Vice 
President Al Gore, Simon joined with philan- 
thropist Michael Milken to start FasterCures, 
a think tank aimed at speeding the develop- 
ment of cancer treatments. He moved on to 
Pfizer and is now CEO of Poliwogg, a life 
sciences investment company. The 64-year- 
old lawyer was diagnosed with chronic 
lymphocytic leukemia in 2014; after 
chemotherapy, he is now healthy. As execu- 
tive director of the Cancer Moonshot Task 
Force, Simon will work with federal agencies 
to break down silos in cancer research 
(http://scim.ag/_cancermoon). President 
Barack Obama is spending $195 million on 
moonshot activities this year and has asked 
Congress for $755 million in 2017. 
IN DEPTH


PARTICLE PHYSICS 


Crunch time for dark matter hunt 
Little confidence that biggest WIMP detector ever will find hypothesized particles 


By Adrian Cho 


This month, in a cavernous labora-
tory 1.4 kilometers below Gran Sasso
d'Italia, the tallest peak in Italy’s Apen-
nine Mountains, physicists will begin
filling a cylindrical tank with liquid
xenon, a frigid substance three times
as dense as water. About 1 meter tall and
wide, the tank will form the heart of XENON
1 Ton (XENON1T), the biggest detector so far
to hunt for weakly interacting massive par- 
ticles (WIMPs), hypothetical particles that 
may make up the mysterious dark matter 
that pervades our galaxy and binds it with its 
gravity. But even as XENON1T crosses a key
size threshold in physicists’ 30-year search 
for WIMPs, researchers are starting to have 
doubts about the concept of WIMPs. 

“It’s crunch time,” says Rocky Kolb, a
theorist at the University of Chicago (UC) 
in Illinois, comparing the WIMP quest 
to a football game. “We’re in the second 
half, maybe the fourth quarter. We haven’t 
scored.” Elena Aprile, an experimentalist 
at Columbia University and leader of the 
XENON team, says the new detector has a 
real shot at discovering WIMPs: “They could 
be right around the corner.” But, she adds, 
“We're perhaps losing faith in the sense that 
we're not sure whether 1 ton or even 10 tons 
will be enough to see anything.”

Other dark matter researchers share her 
concerns. A few years ago, when the biggest 
WIMP detector weighed a few kilograms, 
most thought that a 1-ton experiment would 
either find WIMPs or stick a dagger in the 
idea. But generations of ever bigger detectors 
have come up empty, and physicists are re- 
thinking the argument for WIMPs and what 
it might take to find them. They have bigger 
detectors in the works and are laying plans 
for the ultimate WIMP detector. But even 
avid dark matter hunters aren’t sure that the 
giant detector is worth pursuing. 

The concept of the WIMP dates back to 
the early 1980s. In 1983, experimenters at 
CERN, the European particle physics labo- 
ratory near Geneva, Switzerland, discovered 
the particles that convey the so-called weak 
nuclear force: the fleeting W boson and 
Z boson. At the time, theorists realized that if 
the bosons had a stable uncharged cousin of 
similar mass—say, a few hundred times that 
of a proton—then just enough of those heavy, 
aloof particles should linger from the big 
bang to provide the unseen mass whose grav- 
ity holds galaxies and other cosmic structures 
together. The WIMP was born. 

Inspired by that “WIMP miracle,” theorists 
realized that a concept called supersymmetry 
also predicts WIMPs. Supersymmetry posits 
for every known particle a more massive part-
ner with a different spin. In many versions
of supersymmetry the lightest superpartner 
is stable, neutral, and interacts through only 
the weak force, just like a WIMP. That con- 
nection made the WIMP one of two leading 
candidates for dark matter particles. (The 
other is the axion, a hypothetical particle 
born of the theory of the strong nuclear force 
[Science, 1 November 2013, p. 552].) 

Physicists envision detecting WIMPs 
in three ways (Science, 6 July 2007, p. 32). 
First, they hope to spot them directly as they 
crash into atomic nuclei in ultrasensitive 
detectors like XENON1T, placed deep under-
ground where they’re shielded from ordinary
radiation. Second, they hope to produce them 
and other supersymmetric particles at the 
world’s biggest atom-smasher, CERN’s Large 
Hadron Collider (LHC). Third, they hope to 
spot telltale radiation from WIMPs colliding 
and annihilating one another in the center 
of our galaxy, as supersymmetry predicts 
they might. 

When it comes to underground detectors, 
bigger is better, as a more massive experi- 
ment provides more nuclei for WIMPs to hit. 
Around the world, about a dozen teams are 
working on detectors of various types. The 
current sensitivity record belongs to the 
Large Underground Xenon (LUX) detector
at the Sanford Underground Research Facil-
ity in Lead, South Dakota, which contains
370 kilograms of liquid xenon. In 2014 LUX
leapfrogged the previous, second version of
XENON, which held 62 kilograms of liquid.
XENON1T—which will actually contain
3.5 metric tons of liquid xenon—will vault
into the lead when it starts taking data this
spring, as it should be 100 times more sensi-
tive than LUX.

This battery of photodetectors will look for flashes
of light from WIMPs colliding with xenon nuclei.
PHOTO: ENRICO SACCHETTI

DIAGRAM: L. BAUDIS ET AL., PHYSICS OF THE DARK UNIVERSE 4, (SEPTEMBER 2014) © ELSEVIER/ADAPTED BY H. BISHOP/SCIENCE

sciencemag.org SCIENCE
Downloaded from on March 25, 2016

But physicists are anxious about its 
prospects. Their main concern is that in 6 
years of running, the LHC has found no evi- 
dence of supersymmetry—the foundation of 
the WIMP model. It’s not too late, as until last 
year the LHC ran at only half energy. Still, 
“the 800-pound gorilla in the room is the lack 
of any sign of supersymmetry coming from 
the LHC,” says Juan Collar, an experimental-
ist at UC.

To hedge their bets, dark matter research- 
ers are working on even bigger detectors. 
XENON1T is designed so that in 2 years it
can be expanded to create XENONnT, which
would hold 7.5 metric tons of xenon. And re- 
searchers with LUX are building a detector 
called LZ that should hold 10 metric tons of 
xenon and would come on line in 2019. 

The drive for bigger WIMP detectors 
can’t go on forever, physicists say. Detectors 
100 times more sensitive than XENON1T
would suffer from interference generated by 
neutrinos. Streaming from the sun and other 
sources, these particles will start to make 
confounding WIMP-like signals in the detec- 
tors. Hitting that neutrino background marks 
the end of the road, physicists say. 

Physicists in Europe already are planning
to push to that limit. Laura Baudis, an experi-
mentalist at the University of Zurich in Swit-
zerland and a member of the XENON team,
leads planning for DARWIN, a detector that 
also would sit in Gran Sasso and would con- 
tain 50 tons of liquid xenon. DARWIN would 
cost roughly $60 million, and Baudis says the 
team hopes to get the go-ahead for construc- 
tion in 2020. 

Other physicists seem less than keen on 
the project. “What comes after [XENON1T]
is when you take a broom and make sure 
you haven’t missed anything in the corners,” 
says Rafael Lang, a XENON team member at 
Purdue University in West Lafayette, Indiana. 
UC’s Collar says, “I don’t know how many 
people would be interested in working on 
this. I know I certainly wouldn’t be.” 

But Baudis stresses that DARWIN would 
also pursue key experiments in neutrino 
physics. “I wouldn’t do it only to close the 
gap on the WIMP,” she says. The detector 
could search for a hypothesized nuclear de-
cay called neutrinoless double β decay that
would prove that the neutrino is its own anti- 
particle. It could measure the spectrum of 
solar neutrinos to high precision. 

Meanwhile, some experimenters think 
it’s time to pursue other dark matter candi- 
dates. Collar is working on a small detector 
for more-speculative strongly interacting 
massive particles, which would interact with 
ordinary matter too strongly to reach under- 
ground detectors. But many other proposed 
dark matter particles lack a connection to 
known physics like the WIMP miracle. That 
makes it hard to know what sort of experi- 
ment to build, Aprile says. “There are too 
many ideas,” she says. If the WIMP doesn’t 
show up soon, dark matter hunters may not 
know what to look for next. 


End of the road for WIMP searches 


The quest for weakly interacting massive particles (WIMPs) likely will end when increasingly sensitive detectors
run into interference from neutrinos. They've already ruled out some of the range for supersymmetric WIMPs.
INFECTIOUS DISEASE


Don't blame sports for Zika’s spread


Viral genomes suggest ordinary travel brought the disease to Brazil in late 2013


By Gretchen Vogel 


World Cup fans are probably not to
blame for bringing Zika virus to
Brazil in June and July 2014. And,
contrary to other speculation, nei-
ther are the teams that attended a
championship canoe race in Sep-
tember 2014. According to a new genome
analysis, the virus, first detected in north- 
eastern Brazil in March 2015, had likely been 
spreading there long before either event, 
having arrived sometime between May and 
December 2013. The researchers say it most 
likely slipped in on one of thousands of air- 
line flights from French Polynesia or South- 
east Asia. 

That is an important insight, says 
Matthew Ferrari, an epidemiologist and 
modeler at Pennsylvania State University, 
University Park, who wasn’t involved in the 
study. “Starting with [one-time events] as 
hypotheses can be distracting,” he says. 
“The genome data suggest an entirely 
different timeline.” 

More than 50 researchers from Brazil, the 
United Kingdom, Canada, and the United 
States collaborated on the study, published 
online today in Science. The scientists se- 
quenced the full genome of virus samples 
taken from seven Brazilian patients who 
were infected with Zika between March and 
November 2015. The virus has now raced 
through the Americas, with local transmis- 
sion reported in at least 33 countries. It usu- 
ally causes only mild symptoms, but it has 
recently been linked to a striking increase in 
babies born with microcephaly, correspond- 
ing brain damage, and other neurological ef- 
fects in adults. 

Four of the virus samples came from pa- 
tients with uncomplicated cases. One sample 
was from a baby born with microcephaly, 
who died shortly after birth. Another was 
taken from a 35-year-old man who died from 
complications of encephalitis after being in- 
fected with Zika. And the last came from a 
blood donor who developed a rash—a typi-
cal symptom of Zika—2 days after donating
blood. The scientists looked for, but did not 
find, genetic signatures that would point to 
a mutation that might be fueling the virus’s 
rapid spread or the serious complications. 

To try to retrace the virus’s route, the sci- 
entists compared the genomes of the Brazil- 
ian samples to those from patients in nine 
other countries, six from the current out- 
break in the Americas and one each from 
French Polynesia, the Cook Islands, and 
Thailand. The sequences from the Americas 
were the most closely related; the sequence 
from a patient in Thailand in 2013 was the 
most distant. That’s consistent with the lead- 
ing theory that the virus entered Brazil only 
once, from someone infected in the 2013 Zika 
epidemic in French Polynesia, and spread 
to the rest of the Americas from there, says 
Oliver Pybus, an evolutionary biologist at the 
University of Oxford in the United Kingdom, 
and a co-author on the paper. 

It could have arrived, the authors say, dur- 
ing the Confederations Cup soccer tourna- 
ment in late June 2013. That event brought 
the Tahitian national team to a stadium in 
Recife, near the epicenter of the Brazilian 
epidemic. (Tahiti lost to Uruguay, 8-0.) But 
that was several months before cases of Zika 
were reported in Tahiti, and Pybus thinks it’s 
more fruitful to look at broader travel pat- 
terns rather than discrete events. 

“If we can map flows of people and ani- 
mals,” researchers might be able to find pat- 
terns that could help forecast outbreaks, 
Pybus says. “No amount of looking at indi- 
vidual events is ever going to do that for us.” 
He and his co-authors calculated that dur- 
ing 2013, air travel from Zika-endemic ar- 
eas to Brazil increased by almost 50%, from 
roughly 3500 passengers arriving per month 
to nearly 5000. 

Although researchers tend to focus on the 
well-studied outbreak in French Polynesia, 
other Zika-endemic countries have much 
larger populations and send more travelers 
to Brazil, Pybus notes. More than 1000 air- 
line passengers arrived from the Philippines 
per month in 2013; Indonesia and Thailand 
sent similar numbers. It’s plausible, he says, 
that travelers brought the virus directly from 
Southeast Asia to Brazil, and not from French 
Polynesia. Scott Weaver from the University 
of Texas Medical Branch in Galveston agrees. 
“The Philippines is a very likely source, it 
just hasn’t been sampled,” he says. 

Scientists need more virus genomes from 
those countries to sort out the route Zika 
took to Brazil, Pybus says. “We have a bit of 
a black hole when it comes to understand- 
ing Zika transmission in Southeast Asia.” 
The Tahitian team playing in Recife “is a 
great story,” Pybus says, “but who knows if
it’s true.” 
Comb jellies such as Mnemiopsis leidyi have a through-gut, challenging ideas about when this evolutionary innovation arose.


EVOLUTION 


Comb jelly ‘anus’ guts ideas 
on origin of through-gut 


Videos of captive marine creatures unexpectedly show 
jellies defecate from pores, not via their mouth 


By Amy Maxmen 


No buts about it, the butthole is one
of the finest innovations in the past
540 million years of animal evolution.
The first animals that arose seem to
have literally had potty mouths: Their
modern-day descendants, such as sea
sponges, sea anemones, and jellyfish,
all lack an anus and must eat and
excrete through the same hole. Once an
independent exit evolved, however,
animals diversified into the majority
of species alive today, ranging from
earthworms to humans.

One apparent advantage of a second hole
is that animals can eat while digesting
a meal, whereas creatures with one hole
must finish and defecate before eating
again. Other possible benefits, say
evolutionary biologists, include not
polluting an animal’s dining area and
allowing an animal to evolve a longer
body because it does not have to pump
waste back up toward the head.

However, several unprecedented videos
of gelatinous sea creatures called comb
jellies, or ctenophores, now threaten
to upend the standard view of the
evolution of the so-called through-gut.
On 15 March, at the Ctenopolooza
meeting in St. Augustine, Florida,
evolutionary biologist William Browne
of the University of Miami in Florida
debuted films of comb jellies pooping,
and it wasn’t through their mouths.

Browne’s videos elicited gasps from the
audience because comb jellies, whose
lineage evolved long before other
animals with through-guts, had been
thought to eat and excrete through a
single hole leading to a saclike gut.
In 1880, the German zoologist Carl Chun
suggested a pair of tiny pores opposite
the comb jelly mouth might secrete some
substance, but he also confirmed that
the animals defecate through their
mouths. In 1997, biologists again
observed indigestible matter exiting
the comb jelly mouth, not the
mysterious pores.

“Looks like I’ve been wrong for 30 years.
If people don’t see this video, they won’t
believe it.”
George Matsumoto, Monterey
Bay Aquarium Research Institute

Browne, however, used a sophisticated
video setup to continuously monitor
two species that he keeps in captivity,
Mnemiopsis leidyi and Pleurobrachia
bachei. The movies he played at
Ctenopolooza capture the creatures as they ingest
tiny crustaceans and zebrafish genetically 
engineered to glow red with fluorescent 
protein. Because comb jellies are translu- 
cent, the prey can be seen as it circulates 
through a network of canals lacing the jel- 
lies’ bodies. Fast-forward, and 2 to 3 hours 
later, indigestible particles exit through the 
pores on the rear end. Browne also pre- 
sented a close-up image of the pores, high- 
lighting a ring of muscles surrounding each 
one. “This is a sphincterlike hole,” he told 
the audience. 

“Looks like I’ve been wrong for 30 years,”
said George Matsumoto, a marine bio- 
logist at Monterey Bay Aquarium Research 
Institute in Moss Landing, California, after 
he saw Browne’s talk. “If people don’t see 
this video, they won’t believe it,” he added. 
Matsumoto said he, as well as the bio- 
logists before him, likely missed the bowel 
movements because they did not observe 
their animals long enough after a meal. 
Jellies seen to expel waste from their 
mouths might have been, in effect, vomit- 
ing because they were fed too much, or the 
wrong thing. 

According to recent DNA analyses, comb 
jellies evolved earlier than other animals 
considered to have one hole, including 
sea anemones, jellyfish, and possibly sea 
sponges. (Some studies suggest sponges 
arose first.) Consequently, Browne’s as-yet 
unpublished findings disrupt the step-wise 
progression of digestive anatomy from one 
to two holes early in animal evolution. 

One possibility is that the comb jellies 
evolved through-guts and anuslike pores 
on their own, independent of all other ani- 
mals, over hundreds of millions of years. 
Alternatively, a through-gut and exit hole 
may have evolved once in an ancient ani- 
mal ancestor, and subsequently became 
lost in anemones, jellyfish, and sponges. 
Perhaps if you're an anemone or a sponge 
stuck to a rock, suggests Matsumoto, it’s 
better to push waste back into the current 
rather than below. 

Browne is currently exploring the latter 
theory by seeing whether comb jellies acti- 
vate the same genes when developing their 
pores that other animals do when growing 
an anus. If he finds that the genes are dif- 
ferent, the evolution of our most unspeak- 
able body part will no longer be considered 
the singular event zoologists long sup- 
posed. “We have all these traditional no- 
tions of a ladderlike view of evolution, and 
it keeps getting shaken,” says Kevin Kocot, 
an evolutionary biologist at the University 
of Alabama, Tuscaloosa.


Amy Maxmen is a writer based in 
Berkeley, California. 
SYNTHETIC BIOLOGY




Synthetic microbe has fewest 
genes, but many mysteries 


One-third of 473 genes in microbe have unknown functions 


By Robert F. Service 


When it comes to genome size, a
rare Japanese flower, called Paris
japonica, is the current heavyweight
champ, with 50 times more DNA than
humans. At the other end of the scale,
there’s now
a new lightweight record-holder growing 
in petri dishes in California. This week in 
Science, researchers led by genome sequenc- 
ing pioneer Craig Venter report engineering 
a bacterium to have the smallest genome— 
and the fewest genes—of any freely living 
organism, smaller than the flower’s by a fac- 
tor of 282,000. Known as Syn 3.0, the new 
organism has a genome whittled down to 
the bare essentials needed to survive and 
reproduce, just 473 genes. “It’s a tour de 
force,” says George Church, a synthetic
biologist at Harvard University.

The microbe’s streamlined genetic struc- 
ture excites evolutionary biologists and 
biotechnologists, who anticipate adding 
genes back to it one by one to study their 
effects. “It’s an important step to creat- 
ing a living cell where the genome is fully 
defined,” says synthetic biologist Chris Voigt 
of the Massachusetts Institute of Techno- 
logy in Cambridge. But Voigt and others 
note that this complete definition remains 
a ways off, because the function of 149
of Syn 3.0’s genes—roughly one-third— 
remains unknown. Investigators’ first task 
is to probe the roles of those genes, which 
promise new insights into the basic biology 
of life. 

As Syn 3.0’s name suggests, it’s not the 
first synthetic life made by Venter, who 
heads the J. Craig Venter Institute (JCVI) 
and is a founder of Synthetic Genomics, a 
biotech company, both in San Diego, Cali- 
fornia. In 2010, Venter’s team reported 
that they had synthesized the sole chromo- 
some of Mycoplasma mycoides—a bacte- 
rium with a relatively small genome—and 
transplanted it into a separate mycoplasma 
called M. capricolum, from which they had 
previously extracted the DNA. After several 
false starts, they showed that the synthetic 
microbe booted up and synthesized pro- 
teins normally made by M. mycoides rather 
than M. capricolum (Science, 21 May 2010, 
p. 958). Still, other than adding a bit of 
watermark DNA, the researchers left 
the genetic material in their initial syn- 
thetic organism, Syn 1.0, unchanged from 
the parent. 

In their current work, Venter, along with 
project leader Clyde Hutchison at JCVI, set 
out to determine the minimal set of genes 
needed for life by stripping nonessential 
genes from Syn 1.0. They initially formed
two teams, each with the same task: using 
all available genomic knowledge to design a 
bacterial chromosome with the hypotheti- 
cal minimum genome. Both proposals were 
then synthesized and transplanted into 
M. capricolum to see whether either would 
produce a viable organism. 

“The big news is we failed,” Venter says. 
“I was surprised.” Neither chromosome produced
a living microbe. It’s clear, Venter
says, that “our current knowledge of bio- 
logy is not sufficient to sit down and design 
a living organism and build it.” 

Venter and his colleagues had better 
success with trial and error. They divided 
Syn 1.0’s genome, with its 901 genes, into 
eight sections. To the beginning and end of 
each section they added identical DNA tags 
that made the pieces easy to reassemble. 
That allowed them to treat the sections as 
independent modules, removing each one 
in turn, deleting chunks of DNA, then re- 
assembling the full genome and reinsert- 
ing it into M. capricolum to see whether it 
produced a living cell. If the altered genome 
wasn’t viable, they knew they had cut out 
an essential gene that had to be restored. 
The researchers also assessed the necessity 
of numerous genes in the microbe by insert- 
ing foreign genetic material, called transpo- 
sons, to disrupt their function. 

All this enabled them to systematically 
whittle away genes that either had non- 
essential functions or duplicated the func- 
tion of another gene. In the end, Venter 
says, his team designed, built, and tested
“multiple hundreds” of constructs before 
settling on Syn 3.0, with a genome about 
half the size of Syn 1.0’s. (Syn 2.0 was an 
intermediate stage in this process, the first 
microbe with a genome smaller than that of 
M. genitalium, which with 525 genes has the 
fewest of any free-living natural organism.) 

Once the whittling was complete, the re- 
searchers reordered the remaining genes, 
aligning ones that work in common path- 
ways. The procedure tidied up the genome 
much as a computer compresses and re- 
organizes files on its hard drive to save disk 
space. This will likely make life much easier 
for synthetic biologists who will experiment 
with Syn 3.0 in the future, Voigt says. 

With a total of 531,000 bases, the new or- 
ganism’s genome isn’t much smaller than 
that of M. genitalium, with 600,000 bases.
But M. genitalium grows so slowly that a 
population of cells can take weeks to double. 
Syn 3.0, by contrast, has a doubling time 
of 3 hours, suggesting that it thrives with 
its slimmed down genome. “We're not say- 
ing this is the ultimate minimum genome,” 
Venter says. For now, however, Syn 3.0 reigns 
as the world’s new lightweight champ. 
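
The size comparisons quoted in this article can be cross-checked with a few lines of arithmetic. The sketch below uses only figures given in the text; it assumes that the "factor of 282,000" applies to base counts, and it compares gene counts (not genome lengths) between Syn 3.0 and Syn 1.0. All variable names are illustrative.

```python
# Figures quoted in the article (rounded in the source).
SYN3_BASES = 531_000       # Syn 3.0 genome length, in bases
SIZE_FACTOR = 282_000      # how many times larger the flower's genome is
HUMAN_MULTIPLE = 50        # the flower has "50 times more DNA than humans"
SYN3_GENES = 473
UNKNOWN_GENES = 149        # Syn 3.0 genes with unknown function
SYN1_GENES = 901

# Assumption: the 282,000-fold factor refers to genome length in bases.
flower_bases = SYN3_BASES * SIZE_FACTOR
implied_human_bases = flower_bases / HUMAN_MULTIPLE

# Share of Syn 3.0 genes with unknown function ("roughly one-third").
unknown_share = UNKNOWN_GENES / SYN3_GENES

# Share of Syn 1.0's genes retained in Syn 3.0 (a gene-count ratio only).
genes_kept = SYN3_GENES / SYN1_GENES

print(f"Implied flower genome: {flower_bases / 1e9:.1f} billion bases")
print(f"Implied human genome: {implied_human_bases / 1e9:.1f} billion bases")
print(f"Unknown-function share: {unknown_share:.1%}")
print(f"Genes retained from Syn 1.0: {genes_kept:.1%}")
```

The implied human genome of about 3 billion bases is in line with the commonly cited figure, which suggests the article's two comparisons are mutually consistent.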
Turkish scholar who eluded
arrest describes ‘witch hunt’ 


Three of Meral Camci’s fellow academics are imprisoned
for criticizing the government; more arrests may follow 


By John Bohannon 


The simmering war between Turkish
academics and their increasingly re- 
pressive government came to a boil 
last week with arrests and an escape. 
It began in January with the firing of 
dozens of academics, many of them 
scientists, from Turkey’s universities. All had 
signed an online petition by a group call- 
ing itself Academics for Peace that is criti- 
cal of the government’s 
treatment of the Kurdish 
minority group. The firings sparked
protests and statements from
organizations, including the U.S.
National Academies of Sciences,
Engineering, and Medicine, calling on
the government to respect freedom of
speech.

The standoff held until
13 March, when Kurdish separatists set off
a car bomb in the capital, killing 37 people.
The next day Turkey’s president, Recep
Tayyip Erdogan, announced that the
definition of “terrorism” should be expanded
to include all who provide support in the
form of “propaganda” and specifically called
out academics.

Within hours of the president’s speech,
police arrived at the homes of four Turkish
researchers. Three are now imprisoned;
Turkish academics fear that many more
arrests will follow. But Meral Camci, a literary
scholar who had been dismissed from her
faculty position at Yeni Yüzyıl University in
Istanbul, had gone to France on vacation
just days before.

Science contacted Camci through Turkish
expat researchers. This interview has been
edited for brevity and clarity.


“I am not afraid of anyone or any
institution to use my constitutionally
guaranteed rights.”
Meral Camci


Q: More than 2000 scholars have signed the
petition. What consequences have
they suffered?

A: At Turkey’s 109 public universities, there
have been at least 10 dismissals, five
resignations, 471 disciplinary investigations,
27 suspensions, 156 criminal investigations,
and 35 detentions. And at the 84 private
universities, at least 23 faculty members
have been sacked, one was forced to retire,
and 62 face investigations.


Q: Why were you singled out for arrest?

A: I was one of the four who read a press
release on the 10th of March on behalf
of Academics for Peace in Istanbul. This
might have been the reason why they chose
us first, but the criminal charges were
based on the peace petition, which was
titled “We will not be a
party to this crime!”


Q: What motivated you to [sign the petition]?

A: I wanted to make a contribution to the
struggle for peace and democratic rights
in my country, including the freedom of
expression. What I could
do was to sign a petition, 
which spoke out against the “unacceptable 
deeds” of the state in the Kurdish regions 
of the country, and called for the state to 
take immediate steps toward peace. As for 
the [10 March] press release, what I can say 
is that after all the retaliations academ- 
ics have faced, we wanted to emphasize 
that we are still standing for peace. [We 
also wanted] to make clear what we have 
been through at the universities—a kind 
of witch hunt carried out with dismiss- 
als, forced resignations, and disciplinary 
investigations. I am not afraid of anyone 
or any institution to use my constitution- 
ally guaranteed human rights: freedom of 
speech and expression, freedom of thought, 
and my right to share and publish my 
peaceful demands. 


Q: When can you safely return home? 
A: I cannot say at the moment. But I am 
not seeking political asylum. 


Q: What is your next move? 

A: Our attorneys are trying to get the deci- 
sion of the court canceled. There are no 
valid grounds under Turkish law. We have 
to denounce these unlawful and unjust ar- 
rests of our three colleagues. 
CHINA


Five-year plan boosts basic research funding 


Blueprint gives few details, but scientists foresee more generous grants and new facilities 


By Hao Xin, in Beijing 


China’s economic slowdown could bring
a windfall for basic science. Cosmic 
evolution, the structure of matter, the 
origins of life, and understanding how 
the brain works all deserve strength- 
ened support, according to China’s 
latest 5-year development plan, which could 
triple funding for basic research by 2020. 

An outline of the plan, which covers 2016 
through 2020, received pro forma approval 
by the National People’s Congress (NPC) on 
16 March at its closing session. The plan 
signals that top leaders are looking 
to researchers, even those doing 
fundamental work, for innovations 
that will drive the economy as it 
undergoes structural reforms. “Writ- 
ing ‘cosmic evolution’ into the plan 
illustrates the importance the coun- 
try places on science,” Zheng Yong- 
Chun, a researcher at the National 
Astronomical Observatory (NAO) in 
Beijing, wrote in People’s Daily, the 
official Communist Party newspaper. 

Though details are still scarce, 
Chinese science leaders familiar 
with the plan expect that funding for 
basic research will rise to 10% of to- 
tal R&D spending by 2020, up from 
less than 5% now. That falls short of 
a goal of devoting 15% of total R&D 
expenditures to fundamental sci- 
ence by 2020, set in a mid- to long-term sci- 
ence and technology plan adopted in 2005 
(Science, 17 March 2006, p. 1548). But if the 
10% goal is achieved, investment in basic re- 
search could hit 225 billion yuan, or about 
$34.5 billion, in 2020, compared with last 
year’s $10 billion. 
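
The figures in this paragraph imply a few quantities the text does not spell out. A rough sketch, treating the rounded numbers above as exact (an assumption) and using illustrative variable names:

```python
# Figures quoted in the paragraph above (rounded in the source).
basic_2020_yuan_bn = 225.0   # projected basic research spending, billion yuan
basic_2020_usd_bn = 34.5     # the same projection in billion U.S. dollars
basic_last_year_usd_bn = 10.0
basic_share_2020 = 0.10      # basic research as a share of total R&D

# Exchange rate implied by the two versions of the 2020 figure.
implied_yuan_per_usd = basic_2020_yuan_bn / basic_2020_usd_bn

# Total R&D spending implied by a 10% share.
implied_total_rd_yuan_bn = basic_2020_yuan_bn / basic_share_2020

# How many times larger the 2020 figure is than last year's.
growth_multiple = basic_2020_usd_bn / basic_last_year_usd_bn

print(f"Implied exchange rate: {implied_yuan_per_usd:.2f} yuan per dollar")
print(f"Implied total R&D in 2020: {implied_total_rd_yuan_bn:.0f} billion yuan")
print(f"2020 basic research vs. last year: {growth_multiple:.2f}x")
```

On these numbers the 2020 projection is a little more than three times last year's level, consistent with the statement that funding "could triple."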

The National Natural Science Founda- 
tion of China (NSFC) in Beijing, which funds 
most peer-reviewed grants to individuals and 
small groups, expects its budget to increase 
an average of 10% annually, NSFC President 
Yang Wei told Chinese media. That would 
be half the 20%-plus average annual growth 
NSFC has enjoyed since it was established 
in 1986, but would still boost its appropria- 
tion from $3.7 billion this year to about 
$5.4 billion by 2020. Yang said the increase 
will, among other things, allow the founda- 
tion to double the size of grants awarded un- 
der its outstanding young scientist program. 
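
The NSFC projection is a compound-growth calculation. The sketch below assumes four annual increases between this year (2016) and 2020; the function name and the 20% comparison case are illustrative, not taken from NSFC.

```python
def project_budget(start_bn: float, annual_growth: float, years: int) -> float:
    """Compound a starting budget (in billions) at a fixed annual growth rate."""
    return start_bn * (1 + annual_growth) ** years


START_BN = 3.7   # this year's appropriation, billion U.S. dollars
YEARS = 4        # assumed number of annual increases, 2016 to 2020

at_10_percent = project_budget(START_BN, 0.10, YEARS)
at_20_percent = project_budget(START_BN, 0.20, YEARS)  # historical pace, for contrast

print(f"10% annual growth: ${at_10_percent:.1f} billion by 2020")
print(f"20% annual growth: ${at_20_percent:.1f} billion by 2020")
```

At 10% a year the result rounds to the $5.4 billion quoted above; continuing the historical 20% pace would have yielded a noticeably larger figure.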

The Chinese Academy of Sciences (CAS), 
a bastion of basic research, and the Ministry 
of Science and Technology (MOST), which
primarily supports applied research, can 
also expect hefty increases under the new 
5-year plan. CAS is holding expert meetings 
to help it decide which programs to support, 
according to its website. MOST has already 
called for proposals in nine areas, including 
precision medicine, reproductive health, bio- 
medical materials, global change, and cloud 
computing and big data mining. 

China plans to search for primordial gravitational waves with a new
microwave telescope, to be built at the 5100-meter-high Shiquanhe
Observatory in Ngari, Tibet.

New big science projects, too, are vying
for a share of the increased funding. After
the U.S.-based Advanced Laser Interferometer
Gravitational-Wave Observatory announced
last month that it had detected
a gravitational wave from merging black
holes, Chinese President Xi Jinping asked
Chinese scientists what they could
contribute to the field. Jumping at the chance,
Zhongshan University in Guangzhou
announced its Tianqin program to build a
space-based gravitational-wave detector,
which prompted a CAS group to announce
its own competing Taiji Program in Space.
Both projects call for multiple satellites and
could cost billions of dollars.

Apparently neither has the inside track. 
Instead, CAS President Bai Chunli said at 
a news conference on the sidelines of the 
congress that the academy plans to build a 
microwave telescope at the Shiquanhe Ob- 
servatory in Ngari, Tibet, which could de- 
tect the imprint of primordial gravitational 
waves on the cosmic microwave background. 
That is also the goal of Western projects such 
as Background Imaging of Cosmic Extraga- 
lactic Polarization 2, which uses a telescope 
at the South Pole and made a premature
detection claim 2 years ago. Some in the
Chinese scientific community have suggested
that the Ngari project should enlist interna- 
tional collaborators. 

For one high-profile project the news is 
not as good. China plans to hold off on con- 
struction of the Circular Electron Positron 
Collider (CEPC), intended to generate large 
numbers of Higgs bosons to precisely mea- 
sure the particle’s mass. The project would 
cost somewhere between $3.8 billion and 
$5.4 billion, depending on its circumference. 
Wang Yifang, director of CAS’s Institute 
of High Energy Physics in Beijing, 
the chief sponsor of the CEPC, says 
the project continues to get 
R&D funding. 

Some Chinese scientists say the 
country still has lessons to learn 
about how to support and run big 
science projects. A case in point is 
the Large Sky Area Multi-Object 
Fiber Spectroscopic Telescope (LA- 
MOST) in Xinglong Station (Science, 
4 April 2008, p. 34). Because of a de- 
sign oversight, LAMOST’s astronomi- 
cal seeing—a measure of how blurred 
a point source appears—is not up to 
spec, says Lou Yu-Qing, an astrophysicist
at Tsinghua University here in
Beijing. Heat turbulence inside the 
telescope’s white dome has degraded 
LAMOST’s seeing from an expected 
2 arcseconds to nearly 4 arcseconds or 
worse, Lou says. As a result, the telescope is 
limited to making observations within the 
Milky Way, although its original scientific 
goal was to reach beyond for extragalactic 
surveys. LAMOST does not have the budget 
or expertise to fix such problems. 

As with most big science facilities, the cen- 
tral government footed the telescope’s con- 
struction cost of about $40 million. But it 
provided no funds for operations. CAS pays 
for utilities and maintenance but not per- 
sonnel. According to Chinese media reports, 
LAMOST’s chief scientist, Cui Xiangqun, 
lamented while attending the NPC that the 
facility has been borrowing money from the 
NAO to cover staff salaries. 

A lack of support for ongoing research af- 
flicts many of China’s big science facilities. 
But the situation may soon change, now that 
“leaders want scientific results, too,” says a
scientist who requested anonymity, “and 
they want them fast.” 
SLAUGHTER AT THE BRIDGE


Grisly find suggests Bronze Age northern Europe 
was more organized—and violent—than thought 


By Andrew Curry, in Lübstorf, Germany


The flint arrowhead embedded in 
this upper arm bone first alerted 
archaeologists to the long-ago 
violence in the Tollense Valley. 


1384 25 MARCH 2016 + VOL 351 ISSUE 6280 sciencemag.org SCIENCE 


PHOTO: LANDESAMT FUR KULTUR UND DENKMALPFLEGE MECKLENBURG-VORPOMMERN/LANDESARCHAOLOGIE/S. SUHR 




PHOTO: LANDESAMT FUR KULTUR UND DENKMALPFLEGE MECKLENBURG-VORPOMMERN/LANDESARCHAOLOGIE/C. HARTL-REITER 


About 3200 years ago, two armies
clashed at a river crossing near
the Baltic Sea. The confrontation
can’t be found in any history books
(the written word didn’t become common
in these parts for another 2000 years),
but this was no skirmish between local
clans. Thousands of warriors came
together in a brutal struggle, perhaps fought
on a single day, using weapons crafted from
wood, flint, and bronze, a metal that was
then the height of military technology.

Struggling to find solid footing on the
banks of the Tollense River, a narrow ribbon
of water that flows through the marshes of
northern Germany toward the Baltic Sea,
the armies fought hand-to-hand, maiming
and killing with war clubs, spears, swords,
and knives. Bronze- and flint-tipped arrows
were loosed at close range, piercing skulls
and lodging deep into the bones of young
men. Horses belonging to high-ranking
warriors crumpled into the muck, fatally
speared. Not everyone stood their ground
in the melee: Some warriors broke and ran,
and were struck down from behind.

When the fighting was through, hundreds
lay dead, littering the swampy valley. Some
bodies were stripped of their valuables and
left bobbing in shallow ponds; others sank
to the bottom, protected from plundering by
a meter or two of water. Peat slowly settled
over the bones. Within centuries, the entire
battle was forgotten.

In 1996, an amateur archaeologist found
a single upper arm bone sticking out of the
steep riverbank, the first clue that the
Tollense Valley, about 120 kilometers north of
Berlin, concealed a gruesome secret. A flint
arrowhead was firmly embedded in one
end of the bone, prompting archaeologists
to dig a small test excavation that yielded
more bones, a bashed-in skull, and a
73-centimeter club resembling a baseball bat.
The artifacts all were radiocarbon-dated to
about 1250 B.C.E., suggesting they stemmed
from a single episode during Europe’s
Bronze Age.

Now, after a series of excavations between
2009 and 2015, researchers have begun to
understand the battle and its startling
implications for Bronze Age society. Along a
3-kilometer stretch of the Tollense River,
archaeologists from the Mecklenburg-Vorpommern
Department of Historic Preservation
(MVDHP) and the University of
Greifswald (UG) have unearthed wooden
clubs, bronze spearheads, and flint and
bronze arrowheads. They have also found
bones in extraordinary numbers: the
remains of at least five horses and
more than 100 men. Bones from
hundreds more may remain unexcavated,
and thousands of others
may have fought but survived.

Bones were closely packed in some parts of the excavation, as seen in this photo from 2013. One area of 12 square meters held 1478 bones, including 20 skulls.

“If our hypothesis is correct
that all of the finds belong to the
same event, we’re dealing with a conflict of
a scale hitherto completely unknown north
of the Alps,” says dig co-director Thomas
Terberger, an archaeologist at the Lower
Saxony State Service for Cultural Heritage
in Hannover. “There’s nothing to compare
it to.” It may even be the earliest direct
evidence, with weapons and warriors
together, of a battle this size anywhere in
the ancient world.

Northern Europe in the Bronze Age
was long dismissed as a backwater,
overshadowed by more sophisticated
civilizations in the Near East and Greece. Bronze
itself, created in the Near East around
3200 B.C.E., took 1000 years to arrive
here. But Tollense’s scale suggests more
organization, and more violence, than
once thought. “We had considered
scenarios of raids, with small groups of young
men killing and stealing food, but to imagine
such a big battle with thousands of people
is very surprising,” says Svend Hansen,
head of the German Archaeological
Institute’s (DAI’s) Eurasia Department in Berlin.
The well-preserved bones and artifacts
add detail to this picture of Bronze Age
sophistication, pointing to the existence of a
trained warrior class and suggesting that
people from across Europe joined
the bloody fray.

There’s little disagreement
now that Tollense is something
special. “When it comes to the
Bronze Age, we’ve been missing
a smoking gun, where we have a
battlefield and dead people and weapons
all together,” says University College Dublin
(UCD) archaeologist Barry Molloy. “This is
that smoking gun.”

THE LAKESIDE HUNTING LODGE called
Schloss Wiligrad was built at the turn of the 
19th century, deep in a forest 14 kilometers 
north of Schwerin, the capital of the northern 
German state of Mecklenburg-Vorpommern. 
Today, the drafty pile is home to both the 
state’s department of historic preservation 
and a small local art museum. 

In a high-ceilinged chamber on the castle’s
second floor, tall windows look out on a
fog-shrouded lake. Inside, pale winter light 
illuminates dozens of skulls arranged on 
shelves and tables. In the center of the room, 
long leg bones and short ribs lie in serried 
ranks on tables; more remains are stored in 
cardboard boxes stacked on metal shelves 
reaching almost to the ceiling. The bones 
take up so much space there’s barely room 
to walk. 

When the first of these finds was exca- 
vated in 1996, it wasn’t even 
clear that Tollense was a 
battlefield. Some archaeo- 
logists suggested the skele- 
tons might be from a flooded 
cemetery, or that they had 
accumulated over centuries. 

There was reason for skep- 
ticism. Before Tollense, di- 
rect evidence of large-scale 
violence in the Bronze Age 
was scanty, especially in this 
region. Historical accounts 
from the Near East and 
Greece described epic battles, 
but few artifacts remained 
to corroborate these boast- 
ful accounts. “Even in Egypt, 
despite hearing many tales of 
war, we never find such sub- 
stantial archaeological evi- 
dence of its participants and 
victims,” UCD’s Molloy says.

In Bronze Age Europe, 
even the historical accounts 
of war were lacking, and all 
investigators had to go on 
were weapons in ceremonial 
burials and a handful of mass graves with 
unmistakable evidence of violence, such 
as decapitated bodies or arrowheads em- 
bedded in bones. Before the 1990s, “for a 
long time we didn’t really believe in war in 
prehistory,” DAI’s Hansen says. The grave
goods were explained as prestige objects or 
symbols of power rather than actual weap- 
ons. “Most people thought ancient society 
was peaceful, and that Bronze Age males 
were concerned with trading and so on,” 
says Helle Vandkilde, an archaeologist at 
Aarhus University in Denmark. “Very few 
talked about warfare.” 

The 10,000 bones in this room—what’s 
left of Tollense’s losers—changed all that. 
They were found in dense caches: In one
spot, 1478 bones, among them 20 skulls, 
were packed into an area of just 12 square 
meters. Archaeologists think the bodies 
landed or were dumped in shallow ponds, 
where the motion of the water mixed up 
bones from different individuals. By count- 
ing specific, singular bones—skulls and 
femurs, for example—UG forensic anthropo- 
logists Ute Brinker and Annemarie 
Schramm identified a minimum of 130 in- 
dividuals, almost all of them men, most be- 
tween the ages of 20 and 30. 

Today’s peaceful meanders of the Tollense River once were the site of bitter fighting.

The number suggests the scale of the
battle. “We have 130 people, minimum,
and five horses. And we’ve only opened
450 square meters. That’s 10% of the find
layer, at most, maybe just 3% or 4%,” says
Detlef Jantzen, chief archaeologist at
MVDHP. “If we excavated the whole area,
we might have 750 people. That’s incredible
for the Bronze Age.” In what they admit
are back-of-the-envelope estimates, he
and Terberger argue that if one in five of
the battle’s participants was killed and left
on the battlefield, that could mean almost
4000 warriors took part in the fighting.
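
The back-of-the-envelope estimate above can be reproduced in a few lines. This is only a sketch of the arithmetic as reported: the 750 dead and the one-in-five casualty rate are the researchers' own rough assumptions, and the function name is illustrative.

```python
def estimate_combatants(dead_on_field: int, one_in_n_killed: int) -> int:
    """Scale a count of dead to total participants, assuming one in N was killed."""
    return dead_on_field * one_in_n_killed


MIN_INDIVIDUALS = 130   # minimum number identified so far
PROJECTED_DEAD = 750    # Jantzen's guess for the whole find layer
ONE_IN_N_KILLED = 5     # "one in five" participants killed and left behind

projected_total = estimate_combatants(PROJECTED_DEAD, ONE_IN_N_KILLED)
floor_total = estimate_combatants(MIN_INDIVIDUALS, ONE_IN_N_KILLED)

print(f"Projected participants: {projected_total}")
print(f"Floor from excavated remains alone: {floor_total}")
```

The projected total of 3750 is the "almost 4000" quoted in the text; the result scales directly with both assumptions, so a different casualty rate would change it proportionally.
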
Brinker, the forensic anthropologist in 
charge of analyzing the remains, says the 
wetness and chemical composition of the 
Tollense Valley’s soil preserved the bones 
almost perfectly. “We can reconstruct ex- 
actly what happened,” she says, picking up 
a rib with two tiny, V-shaped cuts on one 
edge. “These cut marks on the rib show he 
was stabbed twice in the same place. We 
have a lot of them, often multiple marks on
the same rib.” 

Scanning the bones using microscopic 
computer tomography at a materials science 
institute in Berlin and the University of Ros- 
tock has yielded detailed, 3D images of these 
injuries. Now, archaeologists are identifying 
the weapons responsible by matching the 
images to scans of weapons found at Tollense 
or in contemporary graves elsewhere in Eu- 
rope. Diamond-shaped holes in bones, for ex- 
ample, match the distinctive shape of bronze 
arrowheads found on the battlefield. (Bronze 
artifacts are found more often than flint at 
Tollense, perhaps because metal detectors 
were used to comb spoil piles for artifacts.) 

The bone scans have also sharpened the 
picture of how the battle unfolded, Terberger 
says. In x-rays, the upper arm bone with an 
embedded arrowhead—the one that trig- 
gered the discovery of the 
battlefield—seemed to show 
signs of healing. In a 2011 
paper in Antiquity, the team 
suggested that the man sus- 
tained a wound early in the 
battle but was able to fight on 
for days or weeks before dy- 
ing, which could mean that 
the conflict wasn’t a single 
clash but a series of skir- 
mishes that dragged out for 
several weeks. 

Microscopic inspection of 
that wound told a different 
story: What initially looked 
like healing—an opaque lin- 
ing around the arrowhead 
on an x-ray—was, in fact, 
a layer of shattered bone, 
compressed by a single im- 
pact that was probably fatal. 
“That let us revise the idea 
that this took place over 
weeks,” Terberger says. So 
far no bodies show healed 
wounds, making it likely the 
battle happened in just a day, 
or a few at most. “If we are dealing with a 
single event rather than skirmishes over 
several weeks, it has a great impact on our 
interpretation of the scale of the conflict.” 

In the last year, a team of engineers in 
Hamburg has used techniques developed to 
model stresses on aircraft parts to under- 
stand the kinds of blows the soldiers suf- 
fered. For example, archaeologists at first 
thought that a fighter whose femur had 
snapped close to the hip joint must have 
fallen from a horse. The injury resembled 
those that result today from a motorcycle 
crash or equestrian accident. 

But the modeling told a different story. 
Melanie Schwinning and Hella Harten-Buga, 
University of Hamburg archaeologists and
engineers, took into account the physical 
properties of bone and Bronze Age weapons, 
along with examples of injuries from horse 
falls. An experimental archaeologist also 
plunged recreated flint and bronze points 
into dead pigs and recorded the damage. 

Schwinning and Harten-Buga say a 
bronze spearhead hitting the bone at a 
sharp downward angle would have been 
able to wedge the femur apart, cracking it 
in half like a log. “When we modeled it, it 
looks a lot more like a handheld weapon 
than a horse fall,” Schwinning says. “We
could even recreate the force it would have 
taken—it’s not actually that much.” They 
estimate that an average-sized man driving 
the spear with his body weight would have 
been enough. 

Why the men gathered in this spot to fight 
and die is another mystery that archaeo- 
logical evidence is helping unravel. The 
Tollense Valley here is narrow, just 50 me- 
ters wide in some spots. Parts are swampy, 
whereas others offer firm ground and solid 
footing. The spot may have been a sort of 
choke point for travelers journeying across 
the northern European plain. 

In 2013, geomagnetic surveys revealed 
evidence of a 120-meter-long bridge or 
causeway stretching across the valley. 
Excavated over two dig seasons, the sub- 
merged structure turned out to be made 
of wooden posts and stone. Radiocarbon 
dating showed that although much of the 
structure predated the battle by more than 
500 years, parts of it may have been built 
or restored around the time of the battle, 
suggesting the causeway might have been 
in continuous use for centuries—a well- 
known landmark. 

“The crossing played an important role 
in the conflict. Maybe one group tried to 
cross and the other pushed them back,” 
Terberger says. “The conflict started there 
and turned into fighting along the river.” 

In the aftermath, the victors may have 
stripped valuables from the bodies they 
could reach, then tossed the corpses into 
shallow water, which protected them from 
carnivores and birds. The bones lack the 
gnawing and dragging marks typically left 
by such scavengers. 

Elsewhere, the team found human and 
horse remains buried a meter or two lower, 
about where the Bronze Age riverbed might 
have been. Mixed with these remains were 
gold rings likely worn on the hair, spiral rings 
of tin perhaps worn on the fingers, and tiny 
bronze spirals likely used as decorations. 
These dead must have fallen or been dumped 
into the deeper parts of the river, sinking 
quickly to the bottom, where their valuables 
were out of the grasp of looters.

The remains testify to the destructiveness of Bronze Age weapons. A wooden club was likely used to bash in
skulls (top), whereas bronze arrows bit deep into bone. This arrow penetrated the skull to the brain.

Bronze Age battle gear

To piece together the battle kits of the Tollense 
warriors, archaeologists have analyzed wounds 
on bones plus weapons of wood, flint, and bronze 
left on the battlefield. Men who had horses and 
carried bronze may have been part of an officer 
class, whereas lower ranking grunts wielded flint 
knives and wooden clubs. 


1. Lethal arrowheads Archaeologists have 
found many arrowheads at Tollense, including 
a bronze one embedded in the back of a skull 
and a flint one lodged in an arm bone. 


2. Bronze spearheads Diamond-shaped 
wounds on bones match the shape of bronze 
spearheads found on the battlefield. 


3. Small and sturdy horses The five small 
horses found on the battlefield may have been 
ridden into battle or used as pack animals. 


4. Battle fashion The clothing of the men 
of Tollense either decayed or was looted, but 
other rare finds of garments from this time 
suggest leather belts, cloaks, and wraparound 
garments fashioned like kilts; the men may 
also have worn felted caps or bronze helmets.

AT THE TIME OF THE BATTLE, northern
Europe seems to have been devoid of towns 
or even small villages. As far as archaeo- 
logists can tell, people here were loosely 
connected culturally to Scandinavia and 
lived with their extended families on 
individual farmsteads, with a population
density of fewer than five people per
square kilometer. The closest known large 
settlement around this time is more than 
350 kilometers to the southeast, in Waten- 
stedt. It was a landscape not unlike agrar- 
ian parts of Europe today, except without 
roads, telephones, or radio. 

And yet chemical tracers in the remains 
suggest that most of the Tollense warriors 
came from hundreds of kilometers away. 
The isotopes in your teeth reflect those in 
the food and water you ingest during child- 
hood, which in turn mirror the surround- 
ing geology—a marker of where you grew 
up. Retired University of Wisconsin, Madi- 
son, archaeologist Doug Price analyzed 
strontium, oxygen, and carbon isotopes in 
20 teeth from Tollense. Just a few showed 
values typical of the northern European 
plain, which sprawls from Holland to Po- 
land. The other teeth came from farther 
afield, although Price can’t yet pin down ex- 
actly where. “The range of isotope values is 
really large,” he says. “We can make a good 
argument that the dead came from a lot of 
different places.” 

Further clues come from isotopes of an- 
other element, nitrogen, which reflect diet. 
Nitrogen isotopes in teeth from some of the 
men suggest they ate a diet heavy in millet, 
a crop more common at the time in south- 
ern than northern Europe. 

Ancient DNA could potentially reveal 
much more: When compared to other 
Bronze Age samples from around Europe at 
this time, it could point to the homelands 
of the warriors as well as such traits as eye 
and hair color. Genetic analysis is just be- 
ginning, but so far it supports the notion 
of far-flung origins. DNA from teeth sug- 
gests some warriors are related to modern 
southern Europeans and others to people 
living in modern-day Poland and Scandi- 
navia. “This is not a bunch of local idiots,” 
says University of Mainz geneticist Joachim 
Burger. “It’s a highly diverse population.” 

As University of Aarhus’s Vandkilde puts 
it: “It’s an army like the one described in 
Homeric epics, made up of smaller war 
bands that gathered to sack Troy”—an
event thought to have happened fewer than 
100 years later, in 1184 B.C.E. That suggests 
an unexpectedly widespread social organi- 
zation, Jantzen says. “To organize a battle 
like this over tremendous distances and 
gather all these people in one place was a 
tremendous accomplishment,” he says.

The things they carried

Archaeologists have recovered a wealth of artifacts from the battlefield, offering a
detailed view of the warriors’ weapons and jewelry. Because many artifacts were found
with metal detectors, bronze and tin objects abound.


Tin rings and 
bronze scrolls 
These two tin rings 
may have been worn 
on warriors’ fingers. 
The small bronze 
scrolls may have 
served as tassels or 
as decorations for 
garments. 


Wooden clubs 

Archaeologists found two clubs at Tollense, 
likely both carried by lower ranking men. 
The simple, 73-cm “baseball bat” was made 
of ash and the 62-cm “croquet mallet” was 
crafted of sloe wood. 


So far the team has published only a 
handful of peer-reviewed papers. With ex- 
cavations stopped, pending more funding, 
they’re writing up publications now. But 
archaeologists familiar with the project 
say the implications are dramatic. Tollense 
could force a re-evaluation of the whole 
period in the area from the Baltic to the 
Mediterranean, says archaeologist Kristian 
Kristiansen of the University of Gothen- 
burg in Sweden. “It opens the door to a lot 
of new evidence for the way Bronze Age so- 
cieties were organized,” he says. 

For example, strong evidence suggests 
this wasn’t the first battle for these men. 
Twenty-seven percent of the skeletons 
show signs of healed traumas from earlier 
fights, including three skulls with healed 
fractures. “It’s hard to tell the reason for 
the injuries, but these don’t look like your 
typical young farmers,” Jantzen says. 

Standardized metal weaponry and the re- 
mains of the horses, which were found inter- 
mingled with the human bones at one spot, 
suggest that at least some of the combatants
were well-equipped and well-trained. “They
weren’t farmer-soldiers who went out every
few years to brawl,” Terberger says. “These
are professional fighters.” 

Body armor and shields emerged in 
northern Europe in the centuries just be- 
fore the Tollense conflict and may have 
necessitated a warrior class. “If you fight 
with body armor and helmet and corselet, 
you need daily training or you can’t move,” 
Hansen says. That’s why, for example, the 
biblical David—a shepherd—refused to don 
a suit of armor and bronze helmet before 
fighting Goliath. “This kind of training is 
the beginning of a specialized group of 
warriors,” Hansen says. At Tollense, these
bronze-wielding, mounted warriors might 
have been a sort of officer class, presiding 
over grunts bearing simpler weapons. 

But why did so much military force con-
verge on a narrow river valley in northern
Germany? Kristiansen says this period
seems to have been an era of significant up-
heaval from the Mediterranean to the Bal-
tic. In Greece, the sophisticated Mycenaean
civilization collapsed around the time of the
Tollense battle; in Egypt, pharaohs boasted
of besting the “Sea People,” marauders from
far-off lands who toppled the neighboring
Hittites. And not long after Tollense, the
scattered farmsteads of northern Europe
gave way to concentrated, heavily fortified
settlements, once seen only to the south.
“Around 1200 B.C.E. there’s a radical change
in the direction societies and cultures are
heading,” Vandkilde says. “Tollense fits
into a period when we have increased
warfare everywhere.”

Bronze ax

Ax heads like this one were used as weapons and
also for chores during the Bronze Age. They were
traded and even hoarded as a form of wealth.

Bronze arm ring

Decorated jewelry shows that at least
some warriors were high-status.

Tollense looks like a first step toward a 
way of life that is with us still. From the 
scale and brutality of the battle to the pres- 
ence of a warrior class wielding sophisti- 
cated weapons, the events of that long-ago 
day are linked to more familiar and recent 
conflicts. “It could be the first evidence of 
a turning point in social organization and 
warfare in Europe,” Vandkilde says.


Andrew Curry is a freelance writer 
based in Berlin.

A Big Bang in spliceosome structural biology

3D snapshots reveal dynamics of the spliceosome on the mRNA splicing pathway

By Jamie H. D. Cate

Splicing machine. A major building block of the mRNA splicing machine, as revealed by single-particle electron
cryomicroscopy. The 1.8 MDa human U4/U6.U5 triple small nuclear ribonucleoprotein complex is shown at 7 Å
resolution. Ribbon diagram representations of some of the molecular components are shown within the cryo-EM structure.

1390 25 MARCH 2016 • VOL 351 ISSUE 6280

Published by AAAS


Look at a protein-coding gene in the
genome of any eukaryote—be it animal,
plant, fungus, or protist—and
you will likely find the coding region
fragmented by intervening sequences
known as introns. When the gene is
transcribed, these introns have to be re-
moved from the pre-messenger RNA
(pre-mRNA) before a protein can be
made. How these introns are removed
has been studied intensively
for decades without the
aid of a three-dimensional map
of the highly dynamic machine
at the heart of the process: the
spliceosome. On page 1416 of
this issue, Agafonov et al. report
the first molecular-resolution
reconstruction of a central assembly of the
human spliceosome, the U4/U6.U5 triple
small nuclear ribonucleoprotein (tri-snRNP)
complex, using cryo-electron microscopy
(cryo-EM) (1). Together with high-resolution
cryo-EM reconstructions of spliceosome
assemblies from fungi (2-5) and the x-ray
crystal structure of the U1 snRNP (6), these
structural models of the splicing machinery 
launch a new era in understanding eukary- 
otic gene regulation. 

The spliceosome is an ancient RNA- 
protein machine in which the RNA is the 
catalytic engine (7), a property shared 
with other ancient cellular machines that 
may be relics from an “RNA world” before 
proteins and DNA appeared. Splicing out 
introns from a pre-mRNA is both a highly 
complex and highly regulated process, and 
gives a single gene the potential to encode 
many different protein variants with differ- 
ent properties and functions. Furthermore, 
evidence is accumulating that defects in 
the splicing pathway may be responsible 
for a number of human diseases (8). 

Splicing cycles through a series of steps 
in which the spliceosome assembles on the 
intron-containing pre-mRNA, defining the 
boundaries between exons—the sequences 
ultimately retained in the mature mRNA— 
and introns (see the figure, panel A). The 
spliceosome assembles on pre-mRNA to 
form complexes named A, B, and C, de- 
pending on the step in the splicing reac- 
tion, using five snRNP complexes, the U1, 
U2, U4, U5, and U6 snRNPs. Three of these
snRNPs join as a pre-assembled unit called
the tri-snRNP (U4/U6.U5), the target of the
present structural investigation (7). When
the proper boundaries of the intron and
flanking exons are located, the spliceosome
undergoes a sequence of rearrangements
leading to ejection of the U4 snRNP and
pairing of the U2 snRNP with U6 to acti-
vate the splicing reaction.

In the structural model of the human tri-
snRNP complex, large differences are appar-
ent when compared to the high-resolution
reconstructions of splicing assemblies from
yeast (1, 4, 5). These differences involve two
key protein components of the spliceosome,
named Brr2 and Prp8. These two proteins re-
side within the U5 snRNP and play central
roles in choreographing the correct assembly
of active spliceosomes and the catalytic cycle
of pre-mRNA splicing. Brr2 is a helicase re-
quired to remove U4 from the precatalytic
spliceosome, thereby allowing U2 to form
a base-pairing interaction with U6. In the
human U4/U6.U5 tri-snRNP complex, Brr2
is held more than 8 nm away from U4. Re-
sults from modeling, protein cross-linking,
and comparison to high-resolution yeast
U4/U6.U5 tri-snRNPs (4, 5) indicate that an-
other protein, Sad1, acts to sequester Brr2
in this inactive position until Brr2 has to be
released, to swing into place for U4 removal
and spliceosome activation.

A second major structural rearrangement 
proposed to be part of the spliceosome 
catalytic cycle involves the central scaffold- 
ing protein Prp8. In the human tri-snRNP 
complex, which represents a state prior to 
pre-mRNA binding, Prp8 is in an “open” 
conformation with its En and NTD1 domains
5 nm apart (1). By contrast, the structure
of the U2/U6.U5 spliceosome, which
probably represents the reaction immedi- 
ately after splicing has occurred (9), reveals 
Prp8 in a closed conformation that brings 
the En and NTD1 domains of Prp8 together 
and holds the U2/U6 and U5 snRNAs in an 
active conformation for splicing (2, 3).

Splicing the message. (A) Splicing of adjacent coding sequences (exons) together requires the spliceosome to recognize boundaries between exons and introns (the sequences to
be removed). Boundaries are marked as 5’SS and 3’SS (5’ and 3’ splice sites). The spliceosome uses a branchpoint (BP) adenosine within the intron during the splicing reaction.
RNA-protein complexes that assemble to form the spliceosome are named U1, U2, U4, U5, and U6 snRNPs. These assemble in a stepwise manner to form complexes named A, B, B*,
and C. High-resolution structures are now available for U1 snRNP, a preexisting U4/U6.U5 tri-snRNP, and a U2/U6.U5 spliceosome after the splicing reaction. The human U4/U6.U5
tri-snRNP described by Agafonov et al. suggests the dynamic structural changes required in this complex to form the U2/U6.U5 core in the active spliceosome. [Adapted from (9)]
(B) Many pre-mRNAs are alternatively spliced to form many variant mRNAs—for example, by skipping an exon. Exon skipping is ubiquitous from fungi to humans.

The knowledge that large-scale rear-
rangements in two key proteins in the
spliceosome likely play key roles in the
cycling of the spliceosome provides a new
perspective on prior genetic and biochemi- 
cal data, and should provide opportunities 
to further explore the functional conse- 
quences of these arrangements. Cryo-EM 
reconstructions of other steps in the splic- 
ing pathway, previously imaged at much 
lower resolution (2 to 4 nm) (8)—too low 
for accurate docking of the new models— 
are now primed to be imaged at resolutions 
better than 1 nm, given the rapid advances 
in cryo-EM in the past 3 years (10). Higher- 
resolution structures of the spliceosome in 
these other steps are likely to reveal addi- 
tional conformational changes required for 
splicing to occur. 

The recent structural models of the spli- 
ceosome in different steps of the splicing 
reaction represent a turning point for the 
field, reminiscent of the change that oc- 
curred when structures of the protein- 
synthesizing machine—the ribosome—were 
resolved in 2000 (11). Now, dozens of ribo- 
some and ribosomal subunit structures are 
determined every year. With the advent of 
high-resolution cryo-EM, the same is likely 
to be true for the spliceosome over the next 
decade. New structures will be needed 
to understand the catalytic cycle and the 
process by which pre-mRNAs can be alter- 
natively spliced to form many different ma- 
ture mRNAs encoding different proteins. 
Model organisms such as fungi, which 
were needed for the first high-resolution 
cryo-EM structures (2-5), will likely con- 
tinue to provide important insights into al- 
ternative splicing—for example, into exon 
skipping (12) (see the figure, panel B). In
the meantime, it will be exciting to see how 
the present burst of spliceosome structural 
knowledge permeates through the field to 
inspire new genetic, biochemical, and bio- 
physical experiments aimed at unraveling 
the fundamental properties of this ancient 
regulator of gene expression.

Thermal trouble in the tropics


Tropical species may be highly vulnerable to climate change 


By Timothy M. Perez, James T. Stroud,
Kenneth J. Feeley


Early Victorian naturalists marveled at
the profusion of diversity they encountered
as they traveled from temperate
to tropical latitudes. The inverse relationship
between latitude and species
richness that these naturalists first
observed is now referred to as the latitu-
dinal diversity gradient. Various ecological
and evolutionary explanations have been
proposed for the latitudinal diversity gradi-
ent. Of these, perhaps none are more rel-
evant to contemporary conservation issues
than Janzen’s hypothesis of latitudinal dif-
ferences in species’ climatic tolerances and
thermal selectivity (1). On page 1437 of this
issue, Chan et al. (2) advance Janzen’s early
theories by elucidating some of the poten-
tial selective pressures imposed by climate
and climate variability.

In 1967, Daniel Janzen published his 
seminal treatise discussing why “mountain 
passes are higher in the tropics” (1). He argued
that in climates with low variability, as
occur throughout much of the tropics, species
should evolve to have narrow thermal
tolerances. In contrast, the high climatic
variability that occurs in most temperate re- 
gions should select for broader thermal tol- 
erances. Therefore, species at lower latitudes 
will generally have smaller elevational ranges 
because they have narrower thermal toler- 
ances, making mountains less surmountable, 
and hence “higher,” from the perspective of
tropical species. The real importance of Jan- 
zen’s ideas lies in the revelation that thermal 
tolerances are traits shaped by selection and 
that they manifest in the realized elevational 
and geographic ranges of species. 

However, climate varies not just with lati- 
tude and elevation, but also through time. 
Increasing atmospheric concentrations of 
CO₂ and other greenhouse gases are driving
rapid changes in Earth’s climate. Alongside 
rising mean annual temperatures, there have 
been dramatic and complex changes in the 
temporal variability of climatic conditions 
over the past century (3). Compared to their 
temperate counterparts, tropical species are 
particularly vulnerable to changes in global 
climate, because they have evolved under 
stable climates and thus have narrow ther- 
mal niches (see the figure) (4, 5). In other 
words, just as mountain passes are “higher 
in the tropics,” so, too, are the effects of climate
change on species predicted to be more
severe in the tropics—despite absolute rates 
of warming being slower there than at higher 
latitudes (6).

Using data from more than 16,000 vertebrate
species and over 150 elevational transects
from around the globe, Chan et al.
provide macroecological evidence supporting 
Janzen’s original thesis, inject the pertinent 
context of climate change, and—perhaps 
most importantly—identify how temporal 
variation in climatic factors is likely to se- 
lect for the breadths of species’ thermal tol- 
erances. More specifically, the authors show 
that high seasonal temperature variability 
and low diurnal temperature variability both 
favor thermal generalist species over special- 
ist species with narrower thermal tolerances. 

Several studies have shown that thermal 
specialists will need to quickly migrate and/ 
or evolve to track the movement of suit- 
able climatic zones (6), potentially leading 
to extinctions as some species fail to keep 
pace (7). The results of Chan et al. suggest 
that additional evolutionary pressures may 
act against thermal specialists at a global 
scale. Long-term increases in seasonal tem- 
perature variability (3) due to the increasing 
frequency and severity of extreme climatic 
events (8), coupled with decreases in diurnal 
temperature variability due to faster night- 
time versus daytime warming (9), will both 
select against thermal specialists. The com- 
bined ecological and evolutionary pressures 
of climate change on specialists may even- 
tually lead to global biodiversity losses and 
biotic homogenization. 

To better predict the impacts of climate
change on biodiversity, further research is
paramount, especially on tropical species,
which are expected to be most sensitive
to changes in climate and climate variation.
Several influential and oft-cited reviews claim
to have uncovered coherent fingerprints of
climate change across Earth’s ecosystems
(10, 11). In reality, however, a clear under-
standing of how climate change is affecting
tropical species is precluded by a paucity of
studies, and hence data, from low-latitude
systems. This geographic bias is further mag-
nified by taxonomic biases: Nearly all studies
that do exist for the tropics come from just
a handful of taxa, generally dominated by
endothermic vertebrates, which may be un-
representative of broader patterns in other
species groups. Geographic and taxonomic
biases result, in part, from the traditional dif-
ficulties of working with diverse taxa in often-
inaccessible locations. Growing collaborative
data networks, combined with new methods
of large-scale data collection (12), are helping
researchers to bypass some of these limita-
tions and will hopefully soon provide us with
a more complete understanding of how tropi-
cal species are responding to climate change.

Tropical species such as those living in these mountains in Pahang, Malaysia, tend
to have narrow thermal tolerances and may therefore be especially
vulnerable to climate change.


The potential effects of climate change
on natural systems are complex and remain 
poorly understood. However, the ecological 
and evolutionary risks to specialist species 
appear particularly ominous. A lamentable 
lack of basic ecological and biogeographic 
data from the tropics limits our ability to 
extrapolate macroecological patterns or 
to model future environmental responses 
to changes in climate. Chan et al. present 
a novel framework for how long-term and 
short-term climatic variability combine 
to shape the evolution of thermal niche 
breadths and hence geographic distribu- 
tions. Additional studies looking at the 
adaptability and plasticity of climatic tol- 
erances of species are now required to im- 
prove predictions of how these species will 
respond to increasing environmental vari- 
ability in a rapidly changing world.

Why latitude matters in climate change. Tropical species, on average, have narrower climatic niches and greater

PHYSICS


A benchmark for materials simulation 


Material properties can now be predicted reliably from first-principles calculations 


By Chris-Kriton Skylaris 


Density functional theory (DFT) stands
out from all first-principles quantum
mechanical methods for the simulation
of materials, as it enables very
good approximations for the complicated
components of electronic motion
called exchange and correlation. DFT
is the method of choice for many materials 
simulations because of the availability of gen- 
eral-purpose programs that can perform cal- 
culations on any material. Results obtained 
with one DFT program 
need to be reproducible 
by any of the other DFT 
programs, and this has 
not been straightforward 
up to now. On page 1415 
of this issue, Lejaegh- 
ere et al. (1) describe an 
extensive effort by de- 
velopers of the major 
solid-state DFT codes to 
provide a unified and re- 
producible benchmark 
of precision for their 
calculations based on a 
reliable criterion, the so-called
Δ gauge. Using the
Δ gauge, the authors found that the level of
precision that can be achieved today in DFT 
calculations of elemental crystalline solids 
is comparable to the precision of the most 
advanced techniques for experimental mea- 
surement of the properties of materials. The 
work leads to the conclusion that the DFT 
simulation of elemental crystalline solids 
is a (computationally) solved problem, but 
also poses the question of whether we can 
achieve the same levels of validation and re- 
producibility for more complex simulations 
of materials involving several elements and/ 
or several methods. 

First-principles quantum mechanical cal- 
culations use the fundamental equations of 
quantum theory that govern the behavior 
of electrons, atoms, and molecules. In prin- 
ciple, quantum theory allows us to compute 
any observable property of materials with 
extremely high accuracy. However, this ca- 
pability can only be realized at the expense 
of appreciable computational effort; for
most cases, the quantum mechanical equations
cannot be solved analytically and must
be approximated numerically. A variety of 
implementations exist that are based on dif- 
ferent numerical approximations, and at the 
theoretical limit of infinite computational 
power, they are expected to produce the 
same answers. However, in real calculations 
it is difficult to know how the numerical ap- 
proximations used by each program affect 
its results. Lejaeghere et al. demonstrate 
this by comparing published calculations of
the lattice constant of crystalline silicon and
showing that in the early years of develop-
ment of such programs, the error in precision
was larger than the difference from the value
measured in experiments.

How the calculations measure up. The Δ gauge approach described by Lejaeghere et al. allows us to
validate calculations between different DFT programs. Can an equivalent criterion be derived for DFT
simulations of complex materials and eventually for multiscale simulations?

The development of DFT by Kohn (2) was 
a major breakthrough in first-principles 
calculations. DFT is an alternative formula- 
tion of quantum theory in which the elec- 
tronic density is the central quantity rather 
than the wave function. This is a dramatic 
simplification, as the electronic density 
is a mathematical function of only three 
geometrical variables x, y, and z, whereas 
the wave function is a function of 3N vari- 
ables, that is, three variables for each of the 
N particles simulated. As a result, DFT, in 
the form developed by Kohn and Sham (3), 
has found extensive use in simulations of 
materials. Although DFT is formally an ex- 
act theory, there is no explicit expression 
for the so-called exchange and correlation 
energy that describes interactions between 
the electrons, so this term has to be ap- 
proximated. Over the past few decades, a 
hierarchy of increasingly accurate approximations
of the term has been developed (4).
DFT has proved its value in calculations in
an impressive range of applications such as 
drug design (5), catalysis (6), crystal struc- 
ture prediction (7), nanoelectronics (8), and 
geophysics (9), to name just a few. 
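
The simplification described above can be written out explicitly. The relations below are the standard textbook forms, given here as an illustrative sketch; the notation (Ψ for the wave function, n for the density, T_s for the kinetic energy of the noninteracting reference system, v_ext for the external potential, E_xc for the exchange and correlation energy, atomic units, spin ignored) is an assumption of this note and does not appear in the article itself.

```latex
% The electronic density depends on one position r (three variables),
% whereas the wave function depends on all N positions (3N variables).
n(\mathbf{r}) = N \int \left| \Psi(\mathbf{r}, \mathbf{r}_2, \dots, \mathbf{r}_N) \right|^2
    \, d\mathbf{r}_2 \cdots d\mathbf{r}_N

% Kohn-Sham form of the total energy: every term is known explicitly
% except E_xc[n], which has to be approximated.
E[n] = T_s[n]
     + \int v_{\mathrm{ext}}(\mathbf{r}) \, n(\mathbf{r}) \, d\mathbf{r}
     + \frac{1}{2} \iint \frac{n(\mathbf{r}) \, n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}
       \, d\mathbf{r} \, d\mathbf{r}'
     + E_{\mathrm{xc}}[n]
```

Only the last term lacks an explicit expression, which is why the hierarchy of approximations mentioned above concerns that term alone.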

The tremendous success of DFT has been 
enabled by the development of highly sophis- 
ticated programs that are general-purpose 
and can perform calculations on any mate- 
rial or molecule. These computational tools 
have a high level of complexity, as a conse- 
quence of the fact that a host of numerical 
techniques are required to solve the DFT 
equations (10). As a result 
of this complexity, DFT 
programs had to be devel- 
oped over many years by 
dedicated communities, 
and each such community 
has acquired expertise on 
how to set up calculations 
with their code to achieve 
the required level of pre- 
cision for each applica- 
tion. These codes have 
now matured and become 
established, and because 
they were developed ac- 
cording to a “black-box” 
philosophy that allows 
the user to control the approximations via 
a small set of input parameters, they are 
now available as tools for research by non- 
experts. As this transition from “community 
products” to general-purpose research tools 
is taking place, it is imperative to be able to 
compare and tune the quality of calculations 
between different codes. Results obtained
with one code are credible only if they can
be reproduced by any of the other codes at
the same level of DFT (that is, with the same
exchange-correlation functional).

Lejaeghere et al. outline the extensive ef- 
forts by developers of the major solid-state 
DFT codes to provide a unified and repro- 
ducible benchmark of precision for DFT 
calculations. This multinational consortium 
evaluated the major codes against each other 
using a reliable criterion, the Δ gauge (see
the figure). DFT calculations of the equation 
of state of elemental crystalline solids have 
been compared among all the codes in the 
study. An outcome of this work is that all 
the DFT codes were able to produce results 
at the same level of precision as the most 
advanced experimental techniques for measuring
structural and electronic properties
of materials. Thus, “computer experiments” 
can be used on a par with experimental in- 
vestigation. Users of DFT codes now have a 
dependable estimate for the level of precision 
of their results and a confidence of reproduc- 
ibility by other DFT codes. This work has far- 
reaching implications, as it affects the entire 
community of DFT users, in fields as diverse 
as metallurgy and biochemistry. 
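
To make the idea of a code-to-code comparison concrete, here is a minimal Python sketch of a Δ-style measure. It assumes that the Δ gauge is the root-mean-square difference between two fitted energy-volume curves over a ±6% window around the equilibrium volume, and it uses a third-order Birch-Murnaghan form for the equation of state. The function names and the two parameter sets (`code_a`, `code_b`) are hypothetical and chosen only for illustration; they are not values from Lejaeghere et al.

```python
import math


def birch_murnaghan(volume, v0, b0, b1):
    """Third-order Birch-Murnaghan energy relative to its minimum (E0 = 0).

    volume, v0 in A^3/atom; b0 in eV/A^3; b1 dimensionless; result in eV/atom.
    """
    x = (v0 / volume) ** (2.0 / 3.0)
    return 9.0 * v0 * b0 / 16.0 * (
        (x - 1.0) ** 3 * b1 + (x - 1.0) ** 2 * (6.0 - 4.0 * x)
    )


def delta_gauge(code_a, code_b, n_points=2001):
    """RMS energy difference (eV/atom) between two equation-of-state curves.

    Assumption: the window spans +/-6% around the mean equilibrium volume,
    and both curves are aligned at their energy minima.
    """
    v0_mean = 0.5 * (code_a["v0"] + code_b["v0"])
    v_min, v_max = 0.94 * v0_mean, 1.06 * v0_mean
    step = (v_max - v_min) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        v = v_min + i * step
        diff = birch_murnaghan(v, **code_a) - birch_murnaghan(v, **code_b)
        total += diff * diff
    # Mean of the squared difference on a uniform grid approximates
    # the integral divided by the window width.
    return math.sqrt(total / n_points)


# Hypothetical fit parameters from two different codes for the same crystal.
code_a = {"v0": 20.00, "b0": 0.500, "b1": 4.5}
code_b = {"v0": 20.10, "b0": 0.490, "b1": 4.6}

delta_mev = 1000.0 * delta_gauge(code_a, code_b)
print(f"Delta between the two codes: {delta_mev:.3f} meV/atom")
```

A single number per element makes it straightforward to tabulate every pair of codes against each other; a value of zero means the two curves coincide over the sampled window.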

Being able to do such accurate quantum 
calculations is insufficient when the goal 
is to solve complex problems of techno- 
logical relevance. Molecules, biomolecules, 
and materials are neither isolated nor at a 
temperature of 0 K. On the contrary, they 
interact heavily with each other and their 
environment (for example, a solvent) and 
are in constant thermal motion. To make 
an impact in grand challenges such as un- 
derstanding the function of a living cell or 
a nanodevice, we will need to tackle much 
larger (thousands to millions of atoms) 
length scales than can be approached with 
conventional DFT. Part of the answer to this 
challenge will be provided by linear-scaling 
DFT approaches (11), which can treat much 
larger numbers of atoms. Inevitably, how- 
ever, multiscale methods that couple DFT 
with coarser descriptions such as classical 
atomistic force fields (12), and eventually 
continuum models, will be needed. These 
multiscale simulations will also need to 
describe how the materials evolve in time, 
so the choice of a configurational sampling 
problem that can be tackled with meth- 
ods such as molecular dynamics (13), with 
implementations able to take advantage of 
the largest supercomputers (14), is equally 
important. Thus, a new, greater challenge is 
posed for the field of materials simulation: 
Can we have the same confidence in the 
reproducibility and precision of multiscale 
simulations as we have now for simple DFT 
calculations? Only time will tell. 

IONIC TRANSPORT 


Two-dimensional nanofluidics 


Restacked exfoliated sheets create interconnected 
nanofluidic channels for ion transport 


By Andrew R. Koltonow and 
Jiaxing Huang 


The remarkable electronic properties of 
graphene and related two-dimensional 
(2D) materials result from the con- 
finement of electrons within the ma- 
terial. Similarly, the interstitial space 
between 2D materials can enable the 
2D confinement of ions and electrolytes 
and alter their transport. Many different 2D 
sheets can be obtained by exfoliation of natural 
layered materials (1), and an 
exfoliation-reconstruction strategy can convert powders 
of layered materials into continuous, robust 
bulk forms in which lamellar nanochannels 
occupy a substantial volume fraction (up to 
several tens of percent). Nanofluidics, which 
enables the manipulation of confined ions 
and electrolytes, has applications in electro- 
chemical energy conversion and storage, bio- 
sensing, and water purification. 

Confining ion flow. (A) Lamellar film with massive arrays of 2D nanofluidic channels can be made by the 
exfoliation-reconstruction approach, as illustrated with models of graphene oxide (GO) sheets that are terminated with negatively 
charged carboxyl groups. (B) Debye layers of neighboring sheets overlap to create unipolar 2D ion channels with 
greatly enhanced cation conductivity. 

Electrolytes exhibit drastically different 
properties when confined in nanochannels. 
For example, in bulk solution, cations and 
anions simultaneously move along oppo- 
site directions to generate ionic current. 
However, in channels narrower than the 
Debye length of the electrolyte (a measure 
of how far electrostatic effects can persist), 
the surface charges on the inner walls re- 
pel ions of the same charge and attract the 
counterions, making them the dominating 
charge carriers (2). Such unipolar ionic 
transport can enhance ionic conductivity 
up to several orders of magnitude. Nano- 
channels that enable such transport can be 
fabricated in bulk materials, but such “top- 
down” methods are rather expensive and 
difficult to scale up. The construction of 
nanofluidic channels with the 2D sheets, a 
“bottom-up” approach, can be done simply 
by casting or filtration of nanosheets dis- 
persed in solution. 
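
The Debye-length criterion above can be checked with a short calculation. The following Python sketch assumes a symmetric 1:1 electrolyte (such as KCl) in water at 25°C and uses the standard Debye-Hückel expression for the screening length. The function names, the relative permittivity of 78.4, and the 1-nm channel height are illustrative assumptions, not parameters taken from the studies cited here.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_nm(conc_mol_per_l, temperature_k=298.15, rel_permittivity=78.4):
    """Debye length (nm) of a symmetric 1:1 electrolyte in water."""
    # For a 1:1 salt the ionic strength equals the salt concentration.
    ionic_strength = conc_mol_per_l * 1000.0  # mol/m^3
    length_m = math.sqrt(
        rel_permittivity * EPS0 * K_B * temperature_k
        / (2.0 * N_A * E_CHARGE ** 2 * ionic_strength)
    )
    return length_m * 1e9


def is_unipolar_regime(channel_height_nm, conc_mol_per_l):
    """True when the channel is narrower than the Debye length."""
    return channel_height_nm < debye_length_nm(conc_mol_per_l)


for conc in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
    print(
        f"{conc:8.4f} mol/L  Debye length = {debye_length_nm(conc):6.2f} nm  "
        f"1-nm channel unipolar: {is_unipolar_regime(1.0, conc)}"
    )
```

Because the Debye length shrinks as the square root of concentration, a channel about 1 nm high stays in the surface-charge-governed regime over a wide range of dilute concentrations and leaves it only in concentrated electrolytes.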

The surface properties and the spac- 
ing of the 2D nanochannels can be con- 
veniently controlled by modifying the 
starting sheets. No matter how electrolytes 
pass through the film, horizontally or vertically, 
they flow through the same set of 
2D channels, with the only difference being 
flux (see the figure, panel A). To create a 
robust film with uniform lamellar spacing, 
the 2D building blocks should have uni- 
form thickness and high aspect ratios. 

A number of 2D materials are already 
available for this purpose. For example, 
filtration of graphene oxide (GO) sheets 
(graphene functionalized with oxygenated 
groups, such as carboxylates) leads to films 
with interlayer spacing of ~1 nm that can be 
tuned by changing the degree of hydration. 
Anions are excluded by the negatively 
charged sheets, enabling cation transport 
controlled by surface charge (see the figure, 
panel B) (3). Guo et al. later demonstrated 
that mechanically pumping electrolyte 
through such 2D channels resulted in uni- 
polar flow of cations, generating electrical 
current on the order of 0.1 A m⁻² bar⁻¹ (4). 
Two-dimensional nanochannels constructed 
with vermiculite clay walls supported high 
proton conductivities approaching those 
of proton-transport polymeric membranes. 
The vermiculite channels have extraordi- 
nary thermal stability and retain their func- 
tions even after baking at 500°C in air (5). 

Packing defects such as voids and dislo- 
cations are to be expected in the lamellar 
films made of irregularly shaped sheets with 
polydisperse sizes (see the figure, panel A). 
The effective total volume of the 2D nano- 
channels can be estimated by comparing the 
value of ionic conductance to that of a bulk 
channel. Recent work by Cheng et al. aims 
to quantify the impact of these defects on 
ionic transport (6). Their continuum ion dif- 
fusion simulations treat a GO film as a stack 
of sheets with uniform size and few-nano- 
meter-sized pinholes. This simple model of 
the nanofluidic channel walls manages to re- 
produce the unipolar transport observed in 
experiments, as well as the influence of the 
membrane’s physical parameters on ionic 
transport. Experimentally, solvent exchange 
allows the interlayer spacing of the film to 
be continuously tuned, and results from the 
corresponding ionic conductivity measure- 
ments converged on the model’s prediction. 
Interlayer spacing critically influences the 
behaviors of confined ions and can be af- 
fected by local stacking defects, so it will be 
useful to adopt a scanning type of x-ray dif- 
fraction technique to map the spacing over 
entire films. 

Enhanced ionic conductivity through 2D 
nanofluidic membranes can be used to cre- 
ate electrochemical devices, especially for 
those with in-plane geometry (7). It is also 
very attractive for designing new ion-selec- 
tive membranes, potentially allowing new 
applications under unprecedentedly ex- 
treme conditions (5). Here, the remaining 
challenge is to significantly increase the 
cross-membrane flux, perhaps by realign- 
ing the horizontal nanochannels toward 
the vertical direction without losing the 
film’s structural integrity. New assembly 
strategies will be needed to create robust 
bulk lamellar materials with tunable channel 
orientations. 

Don't forget 
the surface 


Surface effects play a 
key role in cloud droplet 
formation 


By Barbara Nozière 


Clouds are an essential source of fresh 
water to continents and all ecosystems (1) 
and a major cooling factor in 
the climate budget (2). Yet, predicting 
their formation remains a challenge 
(2). In the atmosphere, cloud droplets 
form not from water vapor alone but 
through condensation of water on aerosol 
particles called cloud condensation nuclei 
(CCN) (3). On page 1447 of this issue, Ruehl 
et al. (4) show experimentally that surface 
effects play a central role in cloud droplet 
formation from CCN. 

The first step in predicting cloud forma- 
tion is to accurately predict the concentra- 
tion of CCN in the air. For the fine aerosol 
particles in the atmosphere (with radii be- 
tween 50 and 150 nm), the contribution to 
the CCN population is determined by the 
chemical composition. This is because, as 
first explained by Köhler in a paper published 
80 years ago (5), some chemicals 
present in these particles can reduce the en- 
ergy barrier limiting cloud droplet forma- 
tion either by reducing the vapor pressure 
around them (Raoult effect) or by lowering 
their surface tension (see the figure). The 
magnitudes of both effects must be known 
to predict cloud droplet formation in the 
atmosphere. 
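
How the two effects set the energy barrier can be illustrated with the classical, simplified Köhler equation, ln S ≈ A/r − B/r³, where the A term carries the surface tension and the B term carries the Raoult effect. The Python sketch below is only an illustration under stated assumptions: an ideal solution, a surface tension that stays constant as the droplet grows, and solute properties resembling ammonium sulfate (density, molar mass, and a van 't Hoff factor of 3). All function names and parameter values are hypothetical choices, not taken from the studies cited here.

```python
import math

R_GAS = 8.314462618   # gas constant, J/(mol K)
M_WATER = 0.018015    # molar mass of water, kg/mol
RHO_WATER = 997.0     # density of water, kg/m^3


def kelvin_coefficient(surface_tension, temperature_k=298.15):
    """A term (m): curvature effect, proportional to surface tension (N/m)."""
    return 2.0 * surface_tension * M_WATER / (R_GAS * temperature_k * RHO_WATER)


def raoult_coefficient(dry_radius_m, vant_hoff=3.0,
                       solute_density=1770.0, solute_molar_mass=0.13214):
    """B term (m^3): solute effect for a dry particle of the given radius.

    Defaults are assumed, ammonium-sulfate-like values.
    """
    return (vant_hoff * solute_density * M_WATER
            / (RHO_WATER * solute_molar_mass)) * dry_radius_m ** 3


def critical_supersaturation_percent(surface_tension, dry_radius_m):
    """Height of the Koehler barrier, as percent supersaturation."""
    a = kelvin_coefficient(surface_tension)
    b = raoult_coefficient(dry_radius_m)
    # Maximum of ln S = a/r - b/r^3 occurs at r = sqrt(3b/a).
    ln_s_crit = math.sqrt(4.0 * a ** 3 / (27.0 * b))
    return (math.exp(ln_s_crit) - 1.0) * 100.0


dry_radius = 50e-9  # m, lower end of the fine-particle range
for sigma in (0.072, 0.050, 0.030):  # N/m; 0.072 is close to pure water
    s_crit = critical_supersaturation_percent(sigma, dry_radius)
    print(f"surface tension {sigma:.3f} N/m -> critical supersaturation {s_crit:.3f}%")
```

In this simplified picture the barrier scales with the surface tension to the power 3/2, so even a moderate reduction in surface tension lowers the supersaturation a particle needs to activate.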

To study the early steps of cloud drop- 
let formation experimentally, scientists 
have built “on-line” instruments that al- 
low atmospheric or laboratory-generated 
submicrometer-scale aerosols to be sam- 
pled in an air flow. In these instruments, 
the aerosols can be exposed to a range of 
controlled relative humidities, followed by 
measurements of either the growth of the 
particles upon water uptake or the number 
of newly formed droplets (6, 7). These mea- 
sured quantities are indirectly linked to the 
Raoult effect and surface tension, making 
accurate prediction difficult. The role of 
aerosol chemical composition on the Raoult 
effect is now acknowledged and widely 
studied (6-8), but that of surface tension is 
more controversial. 

Surface tension—the force that keeps the 
droplets cohesive—is anomalously large in 
pure water because of hydrogen bonding. 
According to Köhler theory (5), surfactants 
in atmospheric aerosols should lower the 
surface tension of growing droplets com- 
pared to pure water, thereby facilitating 
their growth and increasing CCN concen- 
trations. However, most instruments built 
to study cloud droplet formation have been 
unable to detect surface tension effects on 
CCN concentrations (6, 7). As a result, the 
CCN concentrations measured with these 
instruments can only be explained by as- 
suming that no surface effect is present. 
Practically all investigations of cloud forma- 
tion assume that surface tension reduction 
is nonexistent and that the surface tension 
of droplets is that of pure water (7, 8). 


In their study, Ruehl et al. use an original 
experimental setup that can monitor the 
growth of droplets until they reach the criti- 
cal energy barrier (see the figure). The re- 
sults show that when surface-active organic 
compounds are present in the aerosol par- 
ticles, initially as a film, the onset of droplet 
formation is underestimated by ~50%. This 
represents the error made by current mod- 
els that neglect surface effects. 

To account for the observations, the au- 
thors propose to replace the widely used 
one-dimensional equation linking the 
droplet’s surface tension and its surfac- 
tant concentration gradient with a two- 
dimensional “compressed film” model that 
takes into account the area occupied by the 
molecules at the surface. In stark contrast 
with the one-dimensional model, the com- 
pressed film model predicts that the vast 
majority of the surfactant molecules are 


[Figure labels: growing aerosol particle; growing cloud droplet; relative humidity; legend: vapor pressure-reducing compound, surface-active organic compound] 

Formation of a cloud droplet from an aerosol particle. To grow spontaneously into a cloud droplet, an aerosol particle 
must overcome the energy barrier represented by the maximum of the blue curve. Chemical compounds in the particle can 
lower this barrier, thus favoring the formation of cloud droplets, by either reducing the vapor pressure around them (Raoult 
effect, compounds represented by the blue dots) or reducing their surface tension (organic surfactants, represented by the 
red dots with organic chains). The effect of aerosol chemical composition on the Raoult effect has been widely studied, but 
surface tension effects have been harder to detect. Ruehl et al. now provide experimental evidence showing that 
surface-active compounds can also reduce the energy barrier by reducing surface tension. 

Explaining cloud formation. Cloud droplets 
grow from aerosol particles. The chemical 
composition of these particles is crucial for 
whether a cloud droplet forms. 


at the droplet’s surface, resulting in sub- 
stantial surface effects. The results imply 
that the systematic assumption that the 
surface tension of growing cloud droplets 
is that of pure water is at least partly in- 
correct and that the reported agreements 
between measured and predicted CCN con- 
centrations in the atmosphere are based on 
flawed assumptions. 

The organic compounds that Ruehl et 
al. investigated are very abundant in at- 
mospheric aerosols (dicarboxylic acids). 
Furthermore, a recent study has found very 
low surface tension and large surfactant 
concentrations in fine atmospheric aerosols 
(9). Ruehl et al.’s results are thus likely to 
be relevant for the real atmosphere. After 
decades of investigations with the same 
types of instruments, their work shows how 
critical the development of alternative tech- 
niques is for a full understanding of cloud 
formation processes. Techniques that mea- 
sure directly the effects of the Raoult effect 
and surface tension on micrometer- or sub- 
micrometer-scale particles would be espe- 
cially valuable by allowing assumptions to 
be tested. Searching for evidence of the role 
of surfactants on cloud formation directly 
in the atmosphere is also a priority. 

CANCER 


The oncogene makes its escape 


Disruptions in 3D genomic architecture allow cancer 
genes to evade transcriptional silencing 


By Jeremiah Wala and Rameen Beroukhim 


Far from a random tangle, cellular 
DNA is packed into the nucleus with 
astounding precision. Indeed, there 
is growing appreciation for how the 
three-dimensional (3D) organization 
of the genome contributes to controlling 
gene expression. For instance, loops of 
DNA called insulated neighborhoods can 
protect small groups of genes from silencing 
or activation (1). If cancer can result 
from dysregulation of gene expression (2), 
then an enticing hypothesis is that disrupt- 
ing insulated neighborhoods may lead to 
increased transcription of cancer genes. 
On page 1454 of this issue, Hnisz et al. (3) 
use tumor-derived sequencing data and 
targeted deletions in cells to show that dis- 
ruption of insulated neighborhoods leads 
to activation of proto-oncogenes—genes 
with the potential to cause cancer. These 
findings strongly support disruption of 
chromatin structure as causally linked to 
tumorigenesis, and suggest that such dis- 
ruptions may be the hidden culprit driving 
many tumors. 

Physical separation is maintained be- 
tween transcriptionally active and inac- 
tive regions of the genome (4), and recent 
studies have identified units of hundreds 
to thousands of kilobases of DNA, called 
topologically associating domains (TADs), 
where genes exhibit close spatial prox- 
imity and correlated expression (5). The 
boundaries of these TADs are occupied by 
CCCTC-binding factor (CTCF), the primary 
protein that mediates long-range chroma- 
tin interactions in mammals (6). Further 
transcriptional control can occur inside 
CTCF-mediated DNA loops such as insu- 
lated neighborhoods. These loops can stop 
the spread of silencing heterochromatin, 
physically separate genes from activating 
enhancers, or sequester genes together with 
powerful transcriptional activators known 
as superenhancers (1). 

Hnisz et al. asked whether changes to 
this three-dimensional structure of DNA 
contribute to the activation of proto-oncogenes. 
Proto-oncogenes are often not 
expressed, and in a human cell-line model 
of T cell acute lymphoblastic leukemia (T- 
ALL), Hnisz et al. found that many of these 
dormant proto-oncogenes lie in insulated 
neighborhoods. To determine whether 
opening of these DNA loops is sufficient to 
allow transcription of the proto-oncogenes, 
the authors used the DNA editing technique 
called CRISPR (clustered regularly interspaced 
short palindromic repeats)-Cas9 to 
delete the CTCF sites that mark the boundaries 
of insulated neighborhoods containing 
the genes TAL1 and LMO2. TAL1 and LMO2 
are transcription factors that drive hemato- 
poiesis and can be aberrantly activated in T 
cell acute lymphoblastic leukemia. Remark- 
ably, despite targeting regions thousands of 
base pairs away from the gene body, both 
deletions flanking an insulated neighbor- 
hood were sufficient to induce proto-onco- 
gene expression in nonmalignant cells. The 
authors further made a key connection with 
patient data: Recurrent focal deletions tar- 
geting the same region were associated with 
increased TAL1 expression in T cell acute 
lymphoblastic leukemia. 

One model behind the increased expres- 
sion is that releasing DNA loops allows 
physical contact between proto-oncogenes 
and nearby enhancers. Such commandeer- 
ing of enhancers to drive expression is 
reminiscent of enhancer hijacking, where 
superenhancers are brought into proxim- 
ity of proto-oncogenes through genome re- 
arrangement events (7). Interestingly, the 
TAL1 locus can also undergo somatic mutations 
in T cell acute lymphoblastic leukemia 
to create a new superenhancer (8), further 
demonstrating the arsenal of tools with 
which tumors can dysregulate transcription. 

CRISPR-Cas9-mediated deletions of in- 
sulated neighborhood boundaries can acti- 
vate proto-oncogenes in the laboratory, but 
how prevalent are such disruptions in real 
tumors? CTCF binding sites are known to 
be frequently somatically mutated (9); to 
identify whether this is due to positive se- 
lection for chromatin disruption, Hnisz et 
al. separately analyzed mutations in non- 
boundary CTCF sites and boundary CTCF 


Chromatin architecture and gene expression. 

(A) DNA has a hierarchical structure, with units of 
hundreds to thousands of kilobases organized into 
topologically associating domains (TADs). Smaller 
subdomains called insulated neighborhoods are 
organized by CTCF and cohesin binding at insulators. 
(B) Insulated neighborhoods provide protected units 
where genes can be co-regulated. (C) Otherwise 
well-behaved proto-oncogenes can become activated 
in cancer by disrupting the structure of insulated 
neighborhoods through deletions or mutations of 
boundary CTCF binding sites. 

sites in different cancer types. If the target 
of CTCF site alterations is destruction of 
insulated neighborhoods, then only bound- 
ary CTCF sites should be enriched for mu- 
tations. Across a large pan-cancer cohort, 
the authors observed a factor of >2 enrich- 
ment for boundary CTCF site mutations. 
This enrichment was particularly strong 
in liver and esophageal carcinomas, where 
boundary CTCF site mutations were also 
significantly more likely to be found near 
known oncogenes. Whether this enrich- 
ment is driven primarily by activation of 
proto-oncogenes is not clear, and further 
analyses are needed to uncover the specific 
gene targets driving CTCF site alterations 
within different tumor types. Such studies 
may also be useful for clinical genotyping, 
where identification of activated oncogenes 
is a key step in applying the optimal tar- 
geted therapy. 

Genetic events that disrupt insulated 
neighborhoods may be just one of many ways 
that cells alter their 3D chromatin structure 
to dysregulate gene expression. Recently, 
Flavahan et al. reported that disruption 
of TADs by DNA methylation of boundary 
CTCF sites allows a distant active enhancer 
to interact with and drive a key oncogene 
in brain tumors (10). Together with the find- 
ings of Hnisz et al., these pioneering stud- 
ies highlight the diversity of mechanisms by 
which chromatin structure may be targeted 
and suggest that modulating 3D chromatin 
structure may be widespread in cancer. 

By showing that disruption of insulated 
neighborhoods leads to activation of proto- 
oncogenes, Hnisz et al. describe a previously 
unrecognized mechanism by which cancers 
may escape transcriptional regulation. This 
study adds to an expanding understanding 
of the deep impact that alterations outside 
of protein-coding regions can have in driv- 
ing the expression of cancer genes (11-13). 
Future research aimed at deciphering such 
noncoding alterations in cancer will need to 
account for perturbations to the 3D archi- 
tecture of the genome, while also being alert 
to indications of novel methods of transcriptional 
dysregulation. 

Ethics review for international 
data-intensive research 


Ad hoc approaches mix and match existing components 


By Edward S. Dove, David Townend, 
Eric M. Meslin, Martin Bobrow, 
Katherine Littler, Dianne Nicol, 
Jantina de Vries, Anne Junker, Chiara 
Garattini, Jasper Bovenberg, Mahsa 
Shabani, Emmanuelle Lévesque, Bartha 
M. Knoppers 


Historically, research ethics committees 
(RECs) have been guided by ethical 
principles regarding human experimentation 
intended to protect participants 
from physical harms and to 
provide assurance as to their interests 
and welfare. But research that analyzes large 
aggregate data sets, possibly including de- 
tailed clinical and genomic information of in- 
dividuals, may require different assessment. 
At the same time, growth in international 
data-sharing collaborations adds stress to 
a system already under fire for 
subjecting multisite research to 
replicate ethics reviews, which 
can inhibit research without improving the 
quality of human subjects’ protections (1, 2). 
“Top-down” national regulatory approaches 
exist for ethics review across multiple sites 
in domestic research projects [e.g., United 
States (3, 4), Canada (5), United Kingdom (6), 
Australia (7)], but their applicability for data- 
intensive international research has not been 
considered. Stakeholders around the world 
have thus been developing “bottom-up” solu- 
tions. We scrutinize five such efforts involv- 
ing multiple countries around the world, 
including resource-poor settings (table S1), to 
identify models that could inform a frame- 
work for mutual recognition of international 
ethics review (i.e., the acceptance by RECs of 
the outcome of each other’s review). 

Data-intensive projects often raise ethical 
concerns for which RECs have little guid- 
ance. Data can be collected from consenting 
participants at one site but stored, analyzed, 
or linked with data sets elsewhere. Data 
are typically stored for long periods and 
can be reused and (re)linked. Particularly 
problematic is that perceived and legislated 
ownership of data and the responsibility to 
authorize data sharing varies across jurisdic- 
tions. Investigators and RECs must consider 
the security of data management, how the 
privacy of participants will be assured, and 
the overall governance (e.g., use and access) 
of a data set. 

We exclude from our analysis clinical trials 
work, which is led by the International Coun- 
cil (formerly Conference) on Harmonisation 
(8), although we note increasing convergence 
of clinical trials with large, heterogeneous 
data sets (9). 


MODELS AND PRINCIPLES. Our analysis 
revealed three general ethics review mod- 
els—reciprocity, delegation, and federa- 
tion—that clarify and add to what currently 
exists in some jurisdictions and integrate 
existing ethics review approaches in inno- 
vative ways (see table). Each project used 
several mechanisms to achieve greater 
cross-jurisdictional mutual recognition of 
ethics review (table S2). Prior and ongo- 
ing engagement with RECs, institutions, 
or governmental bodies to achieve REC 
alignment (e.g., a memorandum of under- 
standing) can be effective. A well-resourced 
process for developing tools (e.g., custom- 
ized agreements or face-to-face meetings) 
for improved REC review is critical, as is (if 
possible) an opportunity to pilot test them 
before full implementation. 

Ethics review for data-intensive inter- 
national research should be founded on at 
least two principles. First, projects impos- 
ing similar risks on research participants 
should be subjected to similar levels of 
scrutiny by all RECs. Second, if we assume 
that procedural and regulatory alignment is 
in place, once an ethics review opinion has 
been provided, each jurisdiction should not 
require further de novo review. This does 
not foreclose local accommodations for issues 
pertinent to local context (e.g., data 
storage or recruitment methods). 

Three models for building ethics review mutual recognition for data-intensive international research 

RECIPROCITY 
An institution, funder, or regulator/government in one jurisdiction accepts the completed ethics review from another jurisdiction and vice versa through collaborative recognition of equivalent processes and/or standards. 
ADVANTAGES 
Helps build agreement on research participant protections while respecting local context 
Flexibility with review standards 
Potentially time-saving once a decision on equivalence is reached, if applied to a whole class of projects 
DISADVANTAGES 
Some REC system inefficiencies remain (e.g., inconsistent or incompatible opinions) 
Challenge in defining whose protections are “best” 
Time-consuming at the initial implementation stage 
PROJECTS 
Human Heredity and Health in Africa (H3Africa): shared ethics consultation meetings to build trust and REC alignment 
International Cancer Genome Consortium (ICGC): development of ethics review policies 

DELEGATION 
Before review, an institution, funder, or regulator/government delegates ethics review responsibilities to one or several existing designated RECs through agreement. 
ADVANTAGES 
Reduces the potential for inconsistency 
Researchers can channel energy and resources into one or a few RECs 
Increased possibility for specific areas of ethics expertise in the designated REC(s) 
DISADVANTAGES 
Challenge in determining how a REC is chosen 
Challenge in determining how post-approval activities will be handled 
All-or-nothing outcome of review; no room for alternative reviews 
PROJECTS 
Personalized Risk Stratification for Prevention and Early Detection of Breast Cancer (PERSPECTIVE): customized tools and agreements approved by each institution 
ICGC: agreements signed between ministries of health 

FEDERATION 
Institutions, funders, or regulators/governments create a central REC with representation from multiple jurisdictions. 
ADVANTAGES 
Reduces inconsistency in REC review 
Drives improved standards across sites by encouraging a “herd instinct” 
Reduces costs and duplication of efforts 
DISADVANTAGES 
Challenge in developing REC structure and process 
Challenge in balancing cultural representation, power differences, or local priorities 
Challenge in getting several jurisdictions to agree on policy and standards 
PROJECTS 
Indiana University–Moi University (IU-Moi): proposed REC with members of each institution 
Maternal Infant Child & Youth Research Network (MICYRN): federated pediatric REC across Canada 

RECs are likely to be more supportive of 
mutual recognition frameworks if acceptable 
safeguards are in place and there are 
guarantees that, in case of a personal data 
breach, participants can bring an action (i) 
individually in their own jurisdiction and 
(ii) collectively. RECs could have a role in 
working with other bodies, such as data access 
committees, data protection authorities, 
funding agencies, journals, and research employers 
to assure that storage and use of data 
are properly monitored and reported, which 
includes material data breaches and action 
taken. Although there will always be some inconsistency 
within and between RECs, there 
must also be core opinions and underlying 
rationales deemed acceptable by researchers, 
research participants, and society (10). 

Any successful model of ethics review for 
data-intensive international research must 
sustain key functions: robust protection of 
research participants; the gatekeeping role 
of a REC during the research life cycle; integrity 
of the ethics review system and of 
each REC; and trust in the ethics review 
standards and processes to collect, store, 
share, and access data. 

Although no one model will fully suit all 
data-intensive international research and 
multiple variations can be devised, we believe 
that the models identified here can improve 
on the status quo of replicate REC review. 
Until the emergence of a competent and le- 
gitimate system for reviewing and steering 
data-intensive international research, we 
advocate bottom-up, ad hoc solutions, ideally 
coupled with official recognition and support 
by governments and regulators, sponsors, 
funders, institutions, and data access com- 
mittees. As models are tested and improved, 
more systemic solutions can be implemented. 

Organizations have a key role to play. For 
instance, the Global Alliance for Genomics 
and Health has developed policies on ac- 
countability, consent, and privacy and is en- 
gaging stakeholders on the research ethics 
governance of data-intensive projects (11). 
This may assist RECs that need to check 
the consistency of secondary data uses with 
the original consent forms or verify the ad- 
equacy of data protection measures or con- 
sent processes. 

In addition to moving toward common 
ethics review standards and procedural 
alignment, common conditions for exchang- 
ing data should be developed, which we 
believe would make RECs more inclined to 
mutual recognition of ethics review. 

Given the global scale of the task and the 
bottom-up nature of this approach, at this 
stage, there needs to be international com- 
mitment to test these models and variations 
to determine whether they can achieve the 
desired alignment in ethics review of data- 
intensive research. Evidence suggests that 
the current system is not working well; evidence 
is now needed to show whether certain 
alternatives are better. This will necessitate 
defining metrics to evaluate the quality and 
efficiency of ethics review both in the cur- 
rent system and in the proposed models (12). 
Communication with and between RECs 
will be crucial. The era of collaborative data- 
intensive international research gives us an 
opportunity to reform the way in which ethics 
committees across the world work. 

SOLAR CELLS 


Lead halides join the top optoelectronic league 


Lead halide materials have the properties needed to reach the highest photovoltaic efficiencies 


By Eli Yablonovitch 


In any solar cell that begins to approach 
the theoretical limits of performance, an 
intense internal luminescence photon 
gas must be present (see the figure) (1). 
On page 1430 of this issue, Pazos-Outón 
et al. (2) provide evidence for such an 
internal photon gas in lead halide photovol- 
taic cells. These materials thus have proper- 
ties similar to those of GaAs and have the 
potential to be among the top- 
performing solar cell materials. 


This is scientifically remarkable,
because these compounds
are the first high-quality halide
semiconductors. The materials
show promise for photovoltaics, 
light-emitting diodes (LEDs), 
laser refrigeration, thermopho- 
tonics, and a host of other major 
optoelectronic applications. 

Many decades ago, Tom Mc- 
Gill of the California Institute 
of Technology predicted that 
the more ionic a semiconduc- 
tor, the more tolerant it would 
be of dangling bond defects. In 
ionic materials, these dangling 
bond energy levels fall near the 
band edges, leaving the center 
of the bandgap relatively free of 
defect states. The lead halides 
appear to be a prime example 
of this effect. There are indica- 
tions from recent laser cooling 
experiments (3) that single lead 
halide crystals can have ~99% 
internal luminescence yield, a 
prerequisite for the buildup of 
an intense internal photon gas. 
Laser cooling relies on superb 
efficiency for luminescent extraction and is 
another strong indication of the potential 
performance of the lead halides. Laser cool- 
ing, LEDs, and solar cells all rely on >95% 
external extraction of photons from this in- 
ternal photon gas. 

It has been known for over 50 years that
the Shockley-Queisser formula (4) for the
open-circuit voltage $V_{oc}$ needs to be corrected
from the ideal value if the external
luminescence is less than 100%. The open-circuit
voltage is penalized by

$$qV_{oc} = qV_{oc\text{-ideal}} - kT\,\lvert \ln \eta_{ext} \rvert$$

where $\eta_{ext}$ is the aforementioned external
luminescence efficiency (5). The photon
gas and the external luminescence are thus
essential for achieving high voltage from a
solar cell. This has led to the mantra that “a
great solar cell needs to be a great LED” (2).

Department of Electrical Engineering and Computer
Sciences, University of California, Berkeley, CA 94720, USA.
E-mail: eliy@eecs.berkeley.edu

[Figure panels: top, “Conventional picture, 1990-2007, 25.1% efficiency”; bottom, photons reabsorbed/re-emitted multiple times.]

Record breakers. In the conventional picture (top), a photon in a solar cell produces
an electron-hole pair that is collected without need for external luminescence. Recent
studies have shown that good luminescence extraction, assisted by photon recycling,
is required for the highest open-circuit voltages in solar cells such as those made from
GaAs. In this picture (bottom), photons are reabsorbed and re-emitted many times
before an electron-hole pair is collected or a luminescent red photon escapes.
Pazos-Outón et al. now show that lead halide materials have luminescence properties similar
to those of GaAs and may, thus, also reach maximum efficiencies.
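
To make the size of this voltage penalty concrete, the following minimal sketch evaluates the penalty term (kT/q)|ln η_ext| for a few external luminescence efficiencies. It assumes that the penalty takes this logarithmic form, as in the luminescence-corrected Shockley-Queisser analysis (5), and that the cell sits at 300 K. The η_ext values are illustrative rather than taken from the article, and the function name `voc_penalty_volts` is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # Boltzmann constant k
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge q

def voc_penalty_volts(eta_ext, temperature_k=300.0):
    """Open-circuit voltage lost relative to the ideal value, (kT/q)*|ln(eta_ext)|."""
    if not 0.0 < eta_ext <= 1.0:
        raise ValueError("eta_ext must lie in (0, 1]")
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return thermal_voltage * abs(math.log(eta_ext))

# Illustrative efficiencies (assumed values, not measurements from the article).
for eta in (1.0, 0.3, 0.01, 1e-4):
    print(f"eta_ext = {eta:g}: penalty = {1000 * voc_penalty_volts(eta):.0f} mV")
```

Under these assumptions, each factor-of-ten drop in external luminescence efficiency costs roughly 60 mV at room temperature.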


The idea that increasing light emission im- 
proves open-circuit voltage seems paradoxi- 
cal, as it is tempting to equate light emission 
with loss. However, basic thermodynamics 
dictates that materials that absorb sunlight 
must emit in proportion to their absorptiv- 
ity. At open circuit, an ideal solar cell would 
radiate out one photon for every photon 
that it absorbs. The external luminescence 
efficiency is a gauge of whether further loss 
mechanisms in addition to this photon emis- 
sion are present at open circuit. At the opti- 
mum power point, the voltage is reduced by 
a few kT/q from open circuit, and fully 98%
of the open-circuit photons are drawn out of 
the cell as real current. Good external extrac- 
tion at open circuit comes at no penalty in 
current at the optimal operating bias point. 
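
The 98% figure can be checked with a back-of-the-envelope sketch. It assumes an ideal cell in the radiative limit, in which the emitted photon flux scales as exp(qV/kT) and balances absorption exactly at open circuit. Lowering the voltage by n thermal voltages then leaves a fraction exp(-n) of the open-circuit emission, with the remainder drawn out as current. The function names are hypothetical, and the model neglects nonradiative losses, series resistance, and the small equilibrium emission term.

```python
import math

def extracted_fraction(n_thermal_voltages):
    """Fraction of the open-circuit photon emission converted to current when
    the bias sits n*kT/q below V_oc (ideal radiative-limit assumption)."""
    return 1.0 - math.exp(-n_thermal_voltages)

def voltage_drop_for_fraction(fraction):
    """Voltage reduction below V_oc, in units of kT/q, needed to extract the
    given fraction of the open-circuit emission as current."""
    return math.log(1.0 / (1.0 - fraction))

n = voltage_drop_for_fraction(0.98)
print(f"98% extraction needs a drop of about {n:.1f} kT/q")
print(f"check: extracted fraction at that bias = {extracted_fraction(n):.2f}")
```

In this simplified model a drop of about 4 kT/q, roughly 100 mV at room temperature, is enough, consistent with the "few kT/q" quoted above.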
Photon recycling—that is, the reabsorption 
and reemission of photons—provides numer- 
ous opportunities for the photons to find the 
escape cone and contribute to external lumi- 
nescence, but it is not the only such mecha- 
nism. Many years ago, Lush and Lundstrom 
(6) identified another mechanism, solar 
cell surface roughness. Breaking 
plane parallel symmetry tends 
to trap the incoming sunlight, 
boosting incoming current, but 
the random internal scattering 
also allows numerous opportunities
for luminescence to escape.
Photon recycling is a good
option for plane parallel solar
cells, but multiple elastic scattering
events can also produce the
same external luminescence and
therefore the same voltage boost.

The report by Pazos-Outón et
al. shows clearly the presence of
the internal photon gas and the 
photon recycling events that are 
one route to a high-output volt- 
age solar cell (see the figure). The 
next step will be to show that the 
superb external luminescent 
emission is compatible with het- 
erogeneous electrical contacts. 
Generally, in every step of solar 
cell fabrication, it is profitable 
and experimentally straightfor- 
ward to monitor the external 
luminescence efficiency, which 
predicts whether it will be an 
average or record cell. It is with 
external luminescence efficiency 
monitoring that solar cell efficiency records 
are broken today.

The social gene


Advances in genetic research prompt a pair of scientists 
to update the “selfish gene” metaphor 


By Joseph Swift 


Genetic research has moved rapidly
since the publication of Richard 
Dawkins’s The Selfish Gene 40 years 
ago (1). In the intervening years, we 
have come to realize that many of the 
most interesting and important phe- 
nomena in human biology are not caused 
by any single gene. Processes like the im- 
mune system’s ability to recognize infec- 
tion, or the timing of our sleep-wake cycle, 
for example, are the product of many genes 
working together in a highly integrated 
way. Citing a wealth of recent research that 
explores the ways genes work together to 
produce complex biological processes, Itai 
Yanai and Martin Lercher argue that it is 
time to embrace a new, more holistic, meta- 
phor in their book, The Society of Genes. 
“Genes do indeed behave in ways that 
can be described as selfish,” they concede. 
“But genes, like humans, do not live in iso- 
lation.” It is therefore useful to think about 
our genes as members of a society in which 
different genes play specific roles. 
The Society of Genes
Itai Yanai and Martin Lercher
Harvard University Press, 2016. 294 pp.

Rather than focus on any one gene, Yanai
and Lercher invite the reader to step back
and observe how genes assemble together
to make a global genetic system, or genome. 
From here, one can see that the labor within 
the genome is not divided equally. Whereas 
many genes encode for proteins that perform 
a single monotonous task, such as breaking 
down a certain type of sugar or producing a 
specific skin pigment, there are others that 
serve such fundamental roles that their re- 
moval would lead to the crumbling of the 
genomic society altogether. Among the latter 
group are genes that manage the behavior of 
a host of other genes. 

When genes are mismanaged by their 
masters, organisms can be transformed in 
dramatic ways. For example, in humans, 
when SOX9 fails to direct its wide range of 
subordinates succinctly, sex reversal and 
skeletal malformations can occur. 

Given that catastrophic things tend to 
happen when genes don’t work together 
properly, changes to how the genomic so- 
ciety is run are a rare occurrence. When 
genes with new abilities evolve, Darwinian
selection determines whether they will join
the ranks as productive members of society.
Our ancestors obtained genes that could interpret
light as color and a gene for a more
efficient oxygen-carrying hemoglobin in
this very way.

[Figure caption] Like members of human societies, genes “cooperate”
and “compete” to promote survival.

And then there are the genes that don’t 
contribute to society at all. Instead, they se- 
cure their position by hijacking the system. 
The LINE1 gene, for example, encodes only
for its own dispersal, copying and pasting 
itself throughout our genome while provid- 
ing the society with no clear benefit. The 
“bad behavior” of genes amounts to scandal 
in the genomic society, and learning about 
their exploits is one of the most enjoyable 
elements of reading the book. 

There are even genes that work to ensure 
the survival of individual cells within an or- 
ganism by wreaking havoc on others. In fruit 
flies, for example, a pair of genes involved 
in sperm production work in concert to 
produce both a poison and its antidote. The 
toxic compound is released from the cell, 
while the antidote is retained. In this way, 
surrounding sperm cells without the gene 
pair are killed. On reading about such sys- 
tems, one begins to realize that it’s not quite 
right to imagine our genome as some ideal- 
ized republic. This is a society that is easily 
compromised from within its own ranks. 

Despite the genome’s complexity, the 
authors are careful to keep the text acces- 
sible. Indeed, at times the reader may be re- 
minded of those rare high-school teachers 
who could reveal the simple beauty hidden 
within abstract scientific theories. 

The book’s greatest strength is its re- 
markable use of metaphor. However, at 
times, the comparisons confuse rather than 
clarify. There must be a simpler way to ex- 
plain retina biology than to compare it to 
both the structure of an Israeli kibbutz and 
the design of a color television, for example. 
But this is a minor shortcoming for a work 
that largely succeeds in translating the find- 
ings of an esoteric science into something 
that is easily understood. 

In the years since The Selfish Gene was 
published, the human genome has been se- 
quenced, along with the genomes of many 
other species. Indeed, probing one’s own 
genes is beginning to become routine. Thus, 
The Society of Genes represents a timely and 
welcome handbook for navigating this
post-genomic era.

The paradigm shift, in perspective


A new collection of essays takes a fresh look at Thomas Kuhn’s classic text 


By Sandra D. Mitchell 


In his 1962 review of Thomas Kuhn’s The
Structure of Scientific Revolutions, historian
Charles Gillispie proclaimed it “a
very bold venture, this essay.” In hindsight,
it seems not just bold but a pivotal
point in our understanding of science.
Kuhn’s book, like the scientific revolutions 
it documents, initiated a paradigm shift in 
the way we think about scientific practice. 
Rather than seeing scientific change solely as 
rational progress—a slow climb up the moun- 
tain of truth—we now view it 
as a socially produced, psy- 
chologically influenced, and 
somewhat disjointed change 
of explanatory frameworks. 
For decades, nearly every 
undergraduate course in his- 
tory, philosophy, or sociology 
of science used Structure as 
a central textbook. The book 
under review features essays 
contributed by historians, 
social scientists, and philosophers
that reflect on the
development, content, and 
impact of Kuhn’s revolution- 
ary book. 

There is much that is new 
and intriguing in this di- 
verse volume. Some chapters 
invite us “inside the head” 
of Kuhn, through personal 
memories of his unique pedagogical style, 
his letters and notebooks, and his “Aristotle 
experience”: the moment he realized that Ar- 
istotelian physics made sense in the histori- 
cal context in which it was written. Others 
explore how psychological theories, Kuhn’s 
scientific work on radar during World War 
II, and the Cold War culture influenced
Kuhn’s philosophy. Still others focus on the 
text itself, examining how Kuhn redefined 
key concepts, including paradigms, revolu- 
tions, exemplars, and progress. The philoso- 
phy and sociology chapters offer different 
lenses through which to view Structure, but 
historical perspectives occupy center stage. 

Kuhn studied under the Nobel Prize-win- 
ning physicist and mathematician John Van 
Vleck at Harvard in the 1940s and later
joined Van Vleck and the Radio Research
Laboratory in making jamming devices to 
interfere with German radar. Kuhn’s thesis 
research in solid-state physics led to a new 
approximation method that is still used to- 
day, but it was not field-changing by any 
stretch of the imagination. As historian Peter 
Galison puts it: “Kuhn’s whole surround was, 
to grab his own later term, ‘normal science.’”

[Figure caption] Hotly debated among intellectuals and parodied in The New Yorker, The Structure of
Scientific Revolutions caused a sensation in the mid-20th century.

After his election to Harvard’s Society of
Fellows, Kuhn began teaching the history of
science. It is in Kuhn’s notebooks from this
time that we find key influences on Structure.
Early on, Kuhn rejected the logical
positivist view of science and began to draw 
parallels between the process of scientific 
change and the stages of cognitive develop- 
ment, as outlined by the clinical psychologist 
Jean Piaget. Galison deftly exposes the role 
that Kuhn’s own scientific practice played 
in framing the questions that motivated 
Structure and documents how experimental 
psychology became a major resource for an- 
swering those questions. 

Kuhn’s Structure of Scientific Revolutions at Fifty
Reflections on a Science Classic
Robert J. Richards and Lorraine Daston (Eds.)
University of Chicago Press, 2016. 234 pp.

In August 1950, researchers at the University
of Innsbruck in Austria conducted an
experiment in which a subject wore a pair
of mirrored glasses that completely inverted 
his view of the world for 10 days. Historian 
Lorraine Daston’s elegantly crafted chap- 
ter presents this experiment as parallel to 
Kuhn’s view of structural scientific trans- 
formation. Interspersed with her analysis 
of the rise and fall of the understanding of 
“structure,” she documents the experience of 
the experimental subject, who gradually pro- 
gressed from disorientation to adaptation. 

Daston argues that Kuhn’s search for 
“structure” in scientific change ultimately 
undermined his historical
project. Historians who
embrace Kuhn’s campaign
against “Whiggishness”—
interpreting past science 
as good or bad through 
the narrow perspective of 
contemporary views—have 
sought to document scien- 
tific practice as “resolutely 
historical.” This, Daston ar- 
gues, has led to the current 
gulf between philosophy, 
sociology, and history of 
science. But this fragmen- 
tation need not be Kuhn’s 
final legacy. 

Daston challenges Kuhn’s 
view of paradigms as all- 
encompassing ways of see- 
ing the world that cannot 
be captured by rules, only 
learned through exemplars. Such an inter- 
pretation makes paradigm change a neces- 
sarily sudden, revolutionary gestalt shift. 
But perception isn’t like that, she argues. 
As the inverted glasses experiment reveals, 
adaptation to a radical change in perspec- 
tive is something that occurs gradually. 
If those who study science move outside 
Kuhn’s context-inflected interpretations,
she writes, “his final conception of para- 
digm might yet yield a structure capacious 
enough to bring the history, philosophy and 
sociology of science—and much else—back 
together again.” 

Reflecting on the paradigm shift that 
Kuhn’s influential book initiated gives us 
new insight into the current and future state 
of science studies.

Basic science:
Bedrock of progress


ALMOST 4 YEARS ago, one of us (F.S.C.) 
wrote an Editorial (1) affirming the continued
importance of basic research to
the National Institutes of Health (NIH) 
mission. The Editorial emphasized that 
basic scientific discovery is the engine that 
powers the biomedical enterprise, and 
NIH continues to spend more than half its 
budget supporting basic research projects. 
This is critical, because the private sector 
generally funds projects that yield a more 
rapid return on investment. 

Despite these assurances, some members
of the community believe that NIH’s interest
in basic science is flagging. For example,
investigators have told us that the require- 
ment for a “Public Health Relevance” 
statement in every NIH research grant 
application suggests that every project must 
relate directly to a public health concern— 
that NIH places less value on projects that 
cannot be expected to yield an immediate 
public health benefit. This is simply not 
true. As we wrote in our Strategic Plan (2), 
we recognize that many of the most impor- 
tant medical advances trace back to basic 
research that had no explicit disease link. 
To address this concern, we have revised 
our application instructions (3) so that the 
Public Health Relevance statement reflects 
the NIH mission and our commitment
to supporting a robust, diverse research
portfolio, including the pursuit of basic 
biological knowledge. 

We are particularly concerned that 
misperceptions about NIH’s priorities and 
interests may be causing investigators to 
submit fewer basic research applications. 
For example, the National Institute of
Neurological Disorders and Stroke (NINDS)
noticed a gradual and significant decline 
in the number of basic grants awarded 
between 1997 and 2012 (4). This decrease 
in awards was not a consequence of peer 
review given that basic grant applications 
actually did substantially better in review 
than applied research proposals. Instead, 
the major driver of this decline was a 
decrease in the number of fundamental 
basic research applications submitted. 

The taxpayer investment in NIH has 
yielded spectacular returns from basic 
science over the long term. These range 
from the discoveries of the low-density 
lipoprotein receptor (5) and the develop- 
ment of CRISPR-associated protein-9 
nuclease (6, 7) to recent substantial 
advances in structural biology through 
cryo-electron microscopy (8). For this 
track record of success to continue, we 
must continue our vigorous support of the 
pursuit of fundamental knowledge. All of 
NIH’s senior leaders believe strongly that
progress toward these goals occurs most 
rapidly when investigators pursue their 
passions, whether they lie in basic research 
or in applied, disease-focused studies. 

By supporting a broad portfolio of basic, 
translational, population, and clinical 
research, NIH will continue to lead the 
way toward a healthier future.

A research symbiont

M. MCNUTT (“#IAMARESEARCHPARASITE,”
Editorial, 4 March, p. 1005) can be proud to 
be a “research parasite.” The creators of this 
term, Longo and Drazen (1), miss the very 
point of scientific research when they write 
that researchers may “even use the [open] 
data to try to disprove what the original 
investigators had posited.” It is at the core 
of the scientific paradigm that researchers 
take nothing as final truth. In fact, using 
research data to try to disprove a result is
good scientific practice, especially in light of
the replication crisis (2-4). 

However, Longo and Drazen are right 
that scientific data sharing deserves 
recognition. They suggest that credit 
for data sharing should take the form of 
co-authorship, but co-authorship as the 
sole instrument of credit will unnecessar- 
ily restrict the potential of data sharing 
and could be a detriment to the original 
researcher (for instance, if the resulting 
publications lack quality) (5). In the case of 
replication studies and meta-analyses, co- 
authorship makes no scientific sense. 

A more suitable instrument would be a
much higher appraisal of data sharing by 
research communities through citations of 
data sets, awards, and the consideration 
of data “production” in career prospects, 
funding applications, and evaluations. 
With this end in mind, research parasites 
are beneficial for the organism as a whole.

Pseudonymous fame


J. BOHANNON’S In Depth story “Fight over 
author pseudonyms could flare again” 
(26 February, p. 902) described prob- 
lems stemming from authors using the 
pseudonymous screen names under which 
they had done their research instead of 
their real names. The most famous case of 
pseudonymous authorship occurred over 
a century ago in the form of William Sealy 
Gosset’s famous 1908 paper “The prob- 
able error of a mean,” for which he used 
the simple pseudonym Student (1). This
work set the stage for what is now known
as Student’s t test, a hypothesis-testing
tool familiar to practically every analyst 
and statistician. Gosset was employed 
by Arthur Guinness and Sons brewery in 
Dublin, and legend holds that his use of a 
pen name was prompted by the com- 
pany’s concern for secrecy in their use of 
statistical methods for quality control. 
Gosset’s case notwithstanding, pseudonyms
will hopefully not become a more
regular occurrence. Ephemeral screen
names may be acceptable for Internet 
message boards, but their use in research 
papers may ultimately lower the public’s 
perception of the transparency, integrity, 
and timelessness of the permanent scien- 
tific record of human knowledge. 
Michael C. Wendl
Mathematics and Genetics, Washington University,

Comment on “Single-trial spike trains
in parietal cortex reveal discrete steps 
during decision-making” 

Michael N. Shadlen, Roozbeh Kiani, 
William T. Newsome, Joshua I. Gold, 
Daniel M. Wolpert, Ariel Zylberberg, 
Jochen Ditterich, Victor de Lafuente, 
Tianming Yang, Jamie Roitman 

Latimer et al. (Reports, 10 July 2015, p. 184)
claim that during perceptual decision
formation, parietal neurons undergo one- 
time, discrete steps in firing rate instead of 
gradual changes that represent the accumu- 
lation of evidence. However, that conclusion 
rests on unsubstantiated assumptions about 
the time window of evidence accumulation, 
and their stepping model cannot explain 
existing data as effectively as evidence-accu- 
mulation models. 

Full text at http://dx.doi.org/10.1126/science.aad3242


Response to Comment on “Single-trial 
spike trains in parietal cortex reveal 
discrete steps during decision-making” 
Kenneth W. Latimer, Jacob L. Yates, 
Miriam L. R. Meister, Alexander C. Huk, 
Jonathan W. Pillow 

Shadlen et al.’s Comment focuses on
extrapolations of our results that were not 
implied or asserted in our Report. They 
discuss alternate analyses of average fir- 
ing rates in other tasks, the relationship 
between neural activity and behavior, and 
possible extensions of the standard models 
we examined. Although interesting to 
contemplate, these points are not germane 
to the findings of our Report: that step- 
ping dynamics provided a better statistical 
description of lateral intraparietal area 
spike trains than diffusion-to-bound 
dynamics for a majority of neurons. 

Full text at http://dx.doi.org/10.1126/science.aad3596

Comment on “Single-trial spike trains
in parietal cortex reveal discrete steps
during decision-making”

Michael N. Shadlen, Roozbeh Kiani, William T. Newsome, Joshua I. Gold,
Daniel M. Wolpert, Ariel Zylberberg, Jochen Ditterich, Victor de Lafuente,
Tianming Yang, Jamie Roitman


Latimer et al. (Reports, 10 July 2015, p. 184) claim that during perceptual decision 
formation, parietal neurons undergo one-time, discrete steps in firing rate instead of 
gradual changes that represent the accumulation of evidence. However, that conclusion 
rests on unsubstantiated assumptions about the time window of evidence accumulation, 
and their stepping model cannot explain existing data as effectively as
evidence-accumulation models.


Latimer et al. (1) analyzed the spiking activity
of neurons in the lateral intraparietal (LIP) 
area of parietal cortex and challenged the 
hypothesis that these neurons represent 
the accumulation of noisy evidence bearing 
on a perceptual choice (e.g., drift diffusion). They 
conclude that these neurons represent jumps (or 
steps) from a neutral to a high or low state that 
represents the upcoming choice. Accordingly, 
the ramplike activity of LIP neurons is an artifact 
caused by averaging step functions occurring at 
different times. Conceptually, their step model 
implies that LIP activity represents either (i) the 
outcome of the decision, corresponding to steps 
synchronized to the end of the process, or (ii) the 
decision process itself, corresponding to the pop- 
ulation average of all-or-none steps contributed 
by individual neurons at different times. Neither 
interpretation is consistent with existing data. 
The first interpretation is refuted by choice- 
reaction time (RT) experiments (2). Aligned to 
the beginning of deliberation, the across-trial 
averages of such steps would resemble a ramp. 
However, aligned to the end of the decision, syn- 
chronous steps should be obvious [e.g., figures 
2A and 3A in (1)]. The LIP data are inconsistent
with this idea (Fig. 1A): trials with long RT do not
hover in a neutral state until the end of the de- 
cision [see also (3)]. 
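
The claim under dispute, that averaging step functions occurring at different times yields a ramp, is easy to illustrate with a toy simulation. The sketch below is not the model of Latimer et al. It assumes idealized trials that jump from 0 to 1 at a uniformly random time bin. The function names and trial counts are hypothetical.

```python
import random

def step_trial(step_time, n_bins):
    """One idealized trial: rate is 0 before step_time and 1 from then on."""
    return [1.0 if t >= step_time else 0.0 for t in range(n_bins)]

def average_of_steps(n_trials=2000, n_bins=100, seed=0):
    """Across-trial average of step functions with random step times."""
    rng = random.Random(seed)
    totals = [0.0] * n_bins
    for _ in range(n_trials):
        step_time = rng.randrange(n_bins)  # uniformly random step time (assumed)
        for t, value in enumerate(step_trial(step_time, n_bins)):
            totals[t] += value
    return [total / n_trials for total in totals]

avg = average_of_steps()
for t in (0, 25, 50, 75, 99):
    print(f"bin {t:2d}: mean rate = {avg[t]:.2f}")
```

The across-trial average rises roughly linearly even though no single trial ramps, which is why trial averages aligned to stimulus onset cannot by themselves distinguish stepping from gradual accumulation.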

The second interpretation could explain the 
ramps aligned to saccadic responses in the RT 
experiments. However, this interpretation is in- 
consistent with other experiments in which a 
sequence of shapes replaces random-dot motion 
to furnish discrete packets of evidence. Under these 
conditions, LIP neurons do not step to stereotyped 
high or low states. Instead, they produce graded 
responses throughout the decision according to 
the sign and strength of the evidence provided by 
the current shape (Fig. 1B). Further, the graded 
population responses are not simply a mixture of 
high and low steps (4, 5). If they were, the change
in firing rate induced by a shape should diminish 
for later shapes, because the neuron is more like- 
ly to have already stepped. This is clearly incor- 
rect [see figures 3B and 4B in (4)]. Thus, LIP 
neurons encode multiple small, noisy changes 
in evidence (not one-time, all-or-nothing steps) 
in a manner consistent with diffusion or random-walk
dynamics.

These points question the conclusions in (1).
Then why do their analyses suggest stepping? 
Parietal activity can step in the context of quickly 
planned eye movements to visual targets (6, 7). In 
contrast, diffusion (ramping) dynamics arise when 
the decision to make such an eye movement re- 
sults from the temporal integration of evidence 
over a more prolonged interval. Therefore, before 
using models to identify (or refute) neural corre- 
lates of an integration-based decision process, it 
is essential to (i) know that the neural activity in 
question is occurring in a behavioral context that 
is actually based on prolonged integration and 
(ii) focus any model comparison on the epoch in 
which this integration occurs. 

Unfortunately, it is difficult to estimate the 
integration times from the behavioral data in (1).
They did not use an RT experiment, and their 
monkey’s accuracy is flat over the viewing du- 
rations they tested (Fig. 2, filled stars). It is possible 
to deduce integration times from a follow-up ex- 
periment in the same monkey, using a broader 
range of durations (Fig. 2, open symbols). Fitting 
these data with bounded diffusion (curves) yields 
a median integration time of ~250 ms (across all 
motion strengths). However, the monkey’s accuracy 
is substantially worse in the earlier data, analyzed 
in (1). One possibility is that the poorer accuracy
is explained by a combination of guessing and 
overall lower sensitivity—partially compensated 
by an elevated decision bound—whose net effect 
is longer integration times (~310 ms). Alternatively,
the poor accuracy is explained by brief
integration times (~70 ms) or possibly a different
strategy altogether, in which the monkey waits
for salient features (extrema) in the random
dots. These latter alternatives are consistent with
our experience training monkeys on these kinds
of tasks.

Fig. 1. Experimental evidence in support of a gradual accumulation of evidence in LIP. (A) LIP neurons ramp,
on average, during an RT task. Averages are sorted by RT quantile (color), using trials in which the monkey
chose the direction associated with the choice target in the neuron's response field. [Modified from (2),
showing responses from ~200 ms after stimulus onset; see also figure 2, B and D, in (11)]. (B) LIP neurons
undergo multiple incremental changes in firing rate on single trials. On this example trial, the monkey
decided in favor of the green target in the neuron’s response field, consistent with the accumulated
evidence from the sequence of shapes [from movie 3 of (4)]. [For more single-trial examples, see the
movies in (4) and movies 1 and 2 in (5). For population analyses, see (4, 5).]

Fig. 2. Behavioral integration times are difficult to determine from the analyzed data set but are
certainly shorter than the full 500- to 1000-ms viewing durations. Open circles correspond to
behavioral data obtained after the collection of the neural recordings [figure 7D in (12)]. Smooth curves
show fits of a bounded diffusion model, from which we estimate the median decision time to be ~250 ms
across all motion strengths [methods explained in (13, 14)]. The neural data analyzed by Latimer et al. (1)
accompanied the behavioral data shown by the filled stars [from figure 7A of (12)]. Accuracy was unaffected
by viewing duration over the range tested, and overall performance was markedly poorer in this
data set. [Data are from figure 7, A and D, in (12), with missing coherences kindly supplied by Latimer et al.]
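
For readers unfamiliar with bounded diffusion, the following sketch simulates the basic process. Evidence with a constant drift plus Gaussian noise accumulates until it hits an upper or lower bound, and the hitting time is the integration (decision) time. All parameter values (drift rates, bound, time step) are illustrative assumptions, not the fitted values behind the ~250 ms estimate, and the function names are hypothetical.

```python
import random
import statistics

def simulate_trial(drift, bound, rng, dt=0.001, max_time=5.0):
    """Accumulate noisy evidence until it reaches +bound or -bound.

    Returns (decision_time_seconds, chose_upper_bound). Noise has unit
    variance per second; trials that never reach a bound stop at max_time.
    """
    evidence, t = 0.0, 0.0
    noise_sd = dt ** 0.5
    while abs(evidence) < bound and t < max_time:
        evidence += drift * dt + rng.gauss(0.0, noise_sd)
        t += dt
    return t, evidence > 0.0

def median_decision_time(drift, bound=1.0, n_trials=300, seed=1):
    """Median hitting time over simulated trials for one drift rate."""
    rng = random.Random(seed)
    times = [simulate_trial(drift, bound, rng)[0] for _ in range(n_trials)]
    return statistics.median(times)

weak = median_decision_time(drift=1.0)    # weak motion strength (assumed value)
strong = median_decision_time(drift=4.0)  # strong motion strength (assumed value)
print(f"median decision time, weak evidence:   {weak:.2f} s")
print(f"median decision time, strong evidence: {strong:.2f} s")
```

Stronger evidence reaches the bound sooner, so integration can occupy only a fraction of a 500- to 1000-ms viewing period, and nothing in such a process fixes when within the trial it begins.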

Most important, regardless of whether the in- 
tegration times are 70 ms or 300 ms, they are 
substantially shorter than the 500- to 1000-ms 
duration of the trials. Accordingly, integration does 
not need to start at a consistent time within a 
trial. This potential variability exposes a critical 
bias in the model comparison: The stepping mod- 
el is allowed the flexibility to account for random 
times of transitions, but the diffusion model is 
tethered to a fixed start time and therefore is 
unfairly penalized in comparison. A relatively 
short integration window occurring at random 
times during motion viewing can also explain 
other features of (1): the broad distribution of the
time of the putative steps, the absence of a de- 
pendency of step times on motion strength, the 
pattern of response variance [figure 4A in (2)], 
the superior choice predictions of the step model,
and its superior deviance information criterion
(DIC).

Latimer et al. attempt to mitigate some of 
these concerns in their supplementary analysis 
of data from an RT experiment (2). However, that 
analysis also does not convincingly support a 
stepping model. Of the 16 neurons (of 54) chosen 
for analysis, only 10 exhibited the kind of coherence- 
and choice-dependent ramping that is the focus 
of the model comparison. Of these, four support 
diffusion. Moreover, the average ΔDIC in favor of
steps is small (~19; 10 excluding the outlier), even 
though the comparison is biased toward that re- 
sult: (i) the data include many high-motion strength 
trials with brief integration times (e.g., 12% of 
included trials have integration times <150 ms) 
that are likely to be seen as steps; (ii) their integ- 
ration model assumes that the starting time of 
integration is fixed, despite the fact that it varies 
considerably across neurons [see figure S22 in (1)];
and (iii) their own simulations [figure S6 in (1)]
show that their analysis can produce evidence 
for stepping even under simulations of diffusion. 


Identifying the sources of these biases, including 
possibly their model’s handling of negative-going 
rates (which are neither bounded nor stopped 
like the positive-going rates) [supplementary ma- 
terials section 2.1, figure 1B, and figure S9 in (2)] 
and the inability to identify latent firing rates 
from the parameters of diffusion, should be ad- 
dressed before applying these methods to richer 
data sets. 

In summary, Latimer et al. present a statistical
method for inferring discrete steps in firing rate 
from single neurons [similar to (8-10)] and use it 
to claim that averages of random steps are re- 
sponsible for the evolving firing of LIP neurons 
during deliberative decision-making. However, 
they have not supported this claim, and they have 
not provided a plausible explanation for many 
experimental observations supporting the repre- 
sentation of accumulated noisy evidence by single 
neurons in LIP. At present, bounded diffusion 
provides the best account of the ensemble of neu- 
ral data. 
Response to Comment on “Single-trial
spike trains in parietal cortex reveal 
discrete steps during decision-making” 


Kenneth W. Latimer, Jacob L. Yates, Miriam L. R. Meister,
Alexander C. Huk, Jonathan W. Pillow


Shadlen et al.’s Comment focuses on extrapolations of our results that were not implied or
asserted in our Report. They discuss alternate analyses of average firing rates in other 
tasks, the relationship between neural activity and behavior, and possible extensions of the 
standard models we examined. Although interesting to contemplate, these points are not 
germane to the findings of our Report: that stepping dynamics provided a better statistical 
description of lateral intraparietal area spike trains than diffusion-to-bound dynamics
for a majority of neurons.


We organize our Response to Shadlen et al. (1)
around four topics: (i) comparisons to
other experiments, (ii) the integration
behavior of our animals, (iii) alternative
formulations of the drift-diffusion model,
and (iv) interpretation of data from Roitman and
Shadlen (2), followed by (v) technical comments.
(i) Shadlen et al. assert that our main finding 
in Latimer et al. (3) is inconsistent with other 
experiments and analyses. Their figure 1A shows 
saccade-aligned peristimulus time histograms 
(PSTHs) from a reaction time (RT) motion dis- 
crimination task. First, a PSTH (an average over 
trials) cannot provide definitive evidence about 


[Figure 1, panel A title: DDM fits to behavior during ephys recordings; y axis: probability correct.]

Fig. 1. Our animal’s behavior reflects a period of evidence accumulation as long
or longer than durations reported previously. (A) Monkey’s accuracy as a function
of stimulus duration during neurophysiological recording sessions (dots) from (5)
overlaid with the theoretical curves (solid lines) obtained from a maximum likelihood
fit of a DDM. The median durations of behavioral integration under this model
are 408, 362, and 152 ms across the range of motion
strengths shown. These integration times are in fact longer than those
recently reported in (10). A modest lapse rate in the DDM accounts for
the asymptotic performance slightly below 100% [e.g., (7, 8, 15)]. The
other monkey (not discussed by Shadlen et al.) exhibits similar signatures of
substantial evidence accumulation. (B) This animal’s behavior is also very
similar to that of previous studies. Dependency of accuracy on viewing
duration and motion strength [same conventions as in (A)] from a purely


the dynamics governing firing on single trials, 
a primary point of motivation for our Report. 
Second, their figure 1A simply shows that spike 
rate steps are not precisely aligned with saccade 
times. Our Report made no assumptions and 
drew no conclusions about the relationship be- 
tween spike rates and decisions or saccades— 
in fact, both stepping and ramping models were 
fit without knowledge of the animal’s choices. 
Moreover, we obtained an identical stepping 
result with spike trains from this very data set 
(2), indicating that there is nothing inconsistent 
about a finding of stepping dynamics with ramp- 
ing saccade-aligned PSTHs. 


[Figure 1 plot residue: legend, motion strength (51.2% and 3.2% shown); x axis, stimulus duration (ms); y axis, probability correct.]


Shadlen et al. 


Shadlen et al. then argue that lateral intrapari- 
etal (LIP) area activity during a different decision- 
making task (4) conflicts with our results. We 
fail to see the conflict: We did not analyze data 
from this task, and it is entirely conceivable that 
LIP neurons exhibit different dynamics in dis- 
tinct contexts. Regardless, Kira et al. (4) analyzed 
population-averaged responses over many trials, 
and this cannot provide evidence for or against 
stepping single-trial dynamics. Although figure 
1B from Shadlen et al. overlays spikes with in- 
stantaneous decision evidence, this spike train 
appears to be hand-picked, and no single-trial 
spike trains were analyzed in that paper. 

(ii) Shadlen et al. argue that our behavioral 
data (5) might reflect brief or variable evidence 
accumulation. However, a drift-diffusion model 
(DDM) fit confirms that the monkey’s behavior 
during electrophysiological recordings met or ex- 
ceeded conventional periods of integration (Fig. 
1A), and is nearly indistinguishable from that in 
Kiani et al. (6) when assessed with shorter viewing 
durations in sessions outside of electrophysiology 
(Fig. 1B). The slightly lower accuracy observed 
during electrophysiology sessions is not evidence 
that the monkey employed two different decision- 
making strategies and instead is likely a simple 
consequence of differences in stimulus geome- 
try. Specifically, our neural recording sessions 
behavioral data set collected from the same monkey (5) (grayscale dots)
using a range of shorter durations that closely match those used in a 
previous study (6) (colored dots). The data from the two studies are 
very similar (i.e., they overlap and follow matching forms), demonstrat- 
ing that our monkey achieves performance on the random dots task 
nearly identical to that reported in a study by some of the authors of 
employed unique target and dot-motion aperture
locations and motion directions tailored to each 
neuron under study, whereas the shorter-duration 
behavior (shown in Fig. 1B) had consistent cardinal 
stimulus geometry across sessions. This difference 
in stimulus geometry strikes us as a more plausible 
account than Shadlen et al.’s suggestion that this
monkey only employed conventional temporal 
integration when we were not recording from 
LIP. [Additionally, unlike many LIP studies in 
which dot motion is presented at central fixation,
e.g., (2, 6–9), we presented the dots peripherally,
better encouraging broad spatiotemporal integra- 
tion and avoiding individual dot tracking that is 
possible in high-resolution central vision.] 

(iii) Shadlen et al. call for an alternative diffusion-to-bound
model with variable per-trial integration
start times. We constructed our model to be
faithful to previous formulations in the LIP 
literature. Shadlen and Kiani (10) stated, “There’s
a reproducible starting time ~200 ms after the 
onset of motion.” Applications of the DDM for 
behavior typically use a fixed start time [e.g., 
(4, 9, 11)]. Many LIP studies analyze spike counts
averaged across trials and neurons, thereby as- 
suming that integration begins at the same time 
on each trial and in each neuron [e.g., (4, 12)]. 
Without this assumption, average spike rates 
would reflect a mixture of temporally shifted 
ramping trajectories and preintegration activity, 
instead of a coherent ramping process (13). More- 
over, the claim that the fixed start time “unfairly 
penalized” the ramping model is incorrect. The 
stepping model does not have a flexible start time. 
Both models describe spike trains in terms of a
conditionally Poisson process that begins at a
fixed time after motion onset on every trial and
evolves according to discrete stepping or continuous
ramping dynamics for the remainder of the
trial. We did test a range of start times for both 
models for every cell, with no noticeable changes 
to our results [figure S18 in (3)]. 
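
As an illustration only (not the code used in the Report), the two latent-rate descriptions contrasted above can be sketched in a few lines of Python. Every name and parameter value here is hypothetical, the single geometric step time and the hard floor at zero are simplifying assumptions, and the Poisson sampler is a textbook method chosen to avoid external libraries. Both simulated trials begin at the same fixed time (bin 0), mirroring the shared fixed start time described in the text.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson sample with mean lam (Knuth's method; fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def ramping_rate(rng, n_bins, x0=0.5, drift=0.01, noise=0.05, bound_rate=40.0):
    """Latent rate (spikes/s) for one diffusion-to-bound trial.

    x drifts with Gaussian noise from x0 until it reaches the bound at 1,
    after which it stays there. All values are hypothetical.
    """
    x, hit, rates = x0, False, []
    for _ in range(n_bins):
        rates.append(bound_rate * max(x, 0.0))  # assumption: rate floored at 0
        if not hit:
            x += drift + noise * rng.gauss(0.0, 1.0)
            if x >= 1.0:
                x, hit = 1.0, True
    return rates


def stepping_rate(rng, n_bins, r_start=20.0, r_up=40.0, r_down=5.0,
                  p_up=0.6, p_step=0.05):
    """Latent rate (spikes/s) for one stepping trial.

    The rate is constant until a single discrete jump, up or down, at a
    random bin (geometric waiting time is an assumption of this sketch).
    """
    r_final = r_up if rng.random() < p_up else r_down
    rate, rates = r_start, []
    for _ in range(n_bins):
        rates.append(rate)
        if rate == r_start and rng.random() < p_step:
            rate = r_final
    return rates


def spike_counts(rng, rates, dt=0.01):
    """Conditionally Poisson spike counts in bins of width dt seconds."""
    return [poisson(rng, r * dt) for r in rates]


if __name__ == "__main__":
    rng = random.Random(1)
    print(sum(spike_counts(rng, ramping_rate(rng, 100))))
    print(sum(spike_counts(rng, stepping_rate(rng, 100))))
```

Averaging many stepping trials with different step times yields a smoothly changing mean rate, which is why trial-averaged PSTHs alone cannot separate the two accounts.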

(iv) Shadlen et al. claim that our supplemen- 
tary analysis of data from Roitman and Shadlen 
(2) included cells inappropriate for studying evi-
dence accumulation. We are perplexed by this 
argument because it appears to conflict with 
conclusions about LIP function made by Shadlen 
et al. from this data set (point i above). All the 
cells that we analyzed were taken from the same 
data set used to construct Shadlen et al.’s figure 
1A. Additionally, Shadlen et al. posit that trials 
with integration times under 150 ms biased our 
results, but we only analyzed trials with RTs of at 
least 350 ms [see the supplementary materials 
for (3)]. Shadlen et al. assert that the average 
ΔDIC (deviance information criterion) is small
(ΔDIC = 19), but this value is clearly in favor of
stepping; ΔDIC = 10 is conventionally regarded
as “strong” statistical evidence (14). 
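
For readers unfamiliar with the quantity, a minimal sketch of how a DIC difference is formed is given here. It uses the standard definition, DIC = mean deviance + effective number of parameters, where the effective number of parameters is the mean deviance minus the deviance at the posterior mean. The sign convention (ramping minus stepping, so that positive values favor stepping) is inferred from the wording of this Response, and the deviance values are invented toy numbers, not results from the Report.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Deviance information criterion from posterior deviance samples.

    DIC = D_bar + p_D, with p_D = D_bar - D(theta_bar).
    """
    mean_deviance = sum(deviance_samples) / len(deviance_samples)
    effective_params = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + effective_params


# Toy, made-up deviances for one hypothetical neuron.
dic_ramp = dic([1010.0, 1014.0, 1012.0], 1008.0)
dic_step = dic([995.0, 999.0, 997.0], 994.0)

# Assumed convention: positive values favor the stepping model.
delta_dic = dic_ramp - dic_step
```

Lower DIC indicates the preferred model, so under this convention a larger positive difference is stronger evidence for stepping.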

(v) Shadlen et al. assert that our ramping model
simulations produce evidence for stepping.
The figure in question [figure S6 in (3)] shows 
that for ramping simulations (based on our fits 
and trial counts from real data), the distribution 
of ΔDICs strongly supports ramping. A few small
positive values indicated that the two models are
not always identifiable given limited data (3 of
40 individual-neuron simulations yielded negligible
evidence for stepping; ΔDICs < 3), which
does not undermine the consistency of our ana- 
lyses. Finally, Shadlen et al. argue that our model 
comparison is biased because we cannot “iden- 
tify latent firing rates” in the ramping model. 
Frankly, we do not understand this remark. Our 
Bayesian fitting methods integrate over all pos- 
sible latent rates consistent with the data, for 
both models. 

In summary, (i) hypotheses about how LIP 
dynamics relate to decision formation are intriguing 
and worthy of future investigation but not rel- 
evant to our statistical analyses; (ii) our monkey 
behavior and modeling assumptions match pre- 
vious studies of LIP, although we certainly wel- 
come future standardization and generalization 
of experimental and theoretical protocols; and 
(iii) our reanalysis of data from Roitman and 
Shadlen still supported the stepping model. We
stand behind the conclusions of our Report and
believe that considering alternative hypotheses 
to integration will continue to be illuminating. 
We of course agree with Shadlen et al. that ex- 
trapolations of our original study’s character- 
izations of single-neuron spike trains to that of 
population-level dynamics and/or to decision- 
making would be premature. These important 
issues require consideration of richer multineu- 
ron data sets, which we have recently collected 
and are currently analyzing. The ongoing intro- 
duction of powerful new tools and data sets will 
likely bring continued constructive debate, and 
we share Shadlen et al.’s enthusiasm for testing 
and generalizing theories that link brain and 
cognition. 
Geri Richmond and Hashemite University molecular biologist Rana Dajani spoke after Richmond’s AAAS presidential address to open the 2016 AAAS Annual Meeting.


AAAS annual meeting demonstrates the critical 
value of global scientific collaboration 


By Becky Ham 


Some of the most intriguing news at the 2016 AAAS Annual Meeting focused on the tiny:
a minuscule cosmic ripple born 1.5 billion years ago, and a millimeters-long mosquito
responsible for an emerging health crisis. But the science behind these discoveries
is huge in scope and in importance, reflecting the ongoing achievement of international
research teams addressing complex challenges in science and society.

Efforts to track the spread of Zika virus in the Americas, and the landmark discovery
of gravitational waves, both demonstrate the power and potential of, and the need for,
global collaborations between scientists, speakers emphasized at the
11 to 15 February event, held in Washington, DC.

In particular, scientists in developing countries must work as equal partners with
their counterparts in developed countries to solve border-crossing challenges like
climate change and virus outbreaks, said outgoing AAAS President Geri Richmond in her
address at the start of the meeting.

The best science happens, Richmond said, “when everyone is at the table and has an
equal voice, when creativity flows with different perspectives from different countries,
different institutions, and different backgrounds.”

The increasingly global nature of science became clear to Christopher Dye in 2015 when
he walked into an Ebola response team meeting in Monrovia, Liberia. “I counted 12
nationalities in the room,” said Dye, director of strategy in the Office of the
Director General at the World Health Organization. “These were people who had never
worked together, and never met each other before. They came from very diverse
backgrounds, and yet they worked together immediately.”

“And if you want to think about Zika virus, because that’s what worries us at the
moment, will Zika be a story of global science engagement and global engagement
in other forms?” Dye asked in his AAAS plenary address.

[Photo caption: NIAID Director Anthony Fauci proposed new Zika funding.]

Dye joined a panel of public health experts who spoke at the 
meeting on the troubling increase in Zika virus infections and as- 
sociated neurological disorders such as microcephaly and Guillain- 
Barré syndrome in Central and South America. After the World 
Health Organization (WHO) declared the Zika outbreak a public 
health emergency of international concern on 1 February, teams of 
international researchers have traveled to the region to clarify the 
connections between the virus and the disorders, and to look for 
ways to prevent Zika’s spread. Panel speakers said the first results 
from these studies, including a report from Brazil on children with 
microcephaly, are expected in April. 

In a AAAS news briefing, National Institute of Allergy and 
Infectious Diseases (NIAID) Director Anthony Fauci noted 
that President Barack Obama has asked the U.S. Congress for a 
$1.8 billion budget supplement to fight the virus outbreak, which 
would include funding to the National Institutes of Health to help 
develop a Zika vaccine. 

Speakers at the meeting also 
said that challenges such as 
food shocks associated with 
the extreme-weather effects of 
climate change require a coor- 
dinated international response. 
There is a growing likelihood 
of a weather-driven food shock, 
where major harvests of a 
staple crop such as soy, maize, 
wheat, or rice fail and signifi- 
cantly drive up the crop’s price 
on the global market, warned 
Tim Benton of the United 
Kingdom’s Global Food Security 
Program. The global food trade 
network can potentially amplify 
the risks of weather-driven 
food shocks, hitting hardest in 
poor and import-dependent 
countries such as those in sub- 
Saharan Africa, he said. 

Benton and his colleagues 
have been working with govern- 
ment leaders and policy-makers 
around the world to address 
and manage food shock risks. 
“We ask, ‘Are you prepared 
for the consequences?’ And 
typically the answer is ‘No,’” he
said. “Governments are listening, but they aren’t engaged as much 
as we would like them to be.” 

The researchers called for an international plan of action to 
make the food system more resilient, such as adopting contingency 
plans to store key crops or divert certain crops from biofuel use 
when demand reaches a critical level. Biotechnology used to de- 
velop drought-resistant crops, better agricultural practices such as 
crop rotation, and more sustainable use of resources can also help 
agriculture adapt to extreme weather, they said. 

In other sessions throughout the meeting, scientists emphasized 
how their international collaborations have led to productive 
diplomatic relationships between countries. In one such gathering, 
key researchers with the Synchrotron-light for Experimental Sci- 
ence and Applications in the Middle East (SESAME) talked about 
how their center will nourish both technological expertise and 
understanding among people with diverse religions and political
systems in the region. 

Synchrotron particle accelerators such as SESAME produce elec- 
tromagnetic radiation that can be useful in probing the structures 
of complex proteins, identifying the chemicals in a new drug, or 
developing materials that capture pollutants. SESAME’s current 
members include Bahrain, Cyprus, Egypt, Iran, Israel, Jordan, 
Pakistan, the Palestinian Authority, and Turkey. More than a dozen 
nations, including the United States, act as advisors to the project. 

“The first news is that it exists at all,” said Chris Llewellyn- 
Smith, a former director-general of CERN, the European Orga- 
nization for Nuclear Research. Llewellyn-Smith is president of 
the international council that is leading the SESAME project. 

“I believe I am correct in saying that I chair the only body in the
world outside of the United Nations that has representatives of 
Iran and Israel in the same room.” 

When it begins operations later this year, SESAME will join 
a growing number of other “megascience projects” dotting the 
globe, including those involved 
in the blockbuster discovery 
of gravitational waves by the 
Laser Interferometer Gravita- 
tional-Wave Observatory, or 
LIGO project. 

The existence of gravitational
waves (ripples in the
fabric of spacetime) was predicted
100 years ago by Albert
Einstein, who thought that the 
effect would be too small to 
ever detect. At a special session 
convened at the AAAS meeting, 
Gabriela Gonzalez, a professor 
of physics at Louisiana State 
University and spokesperson 
for the LIGO Scientific Col- 
laboration, described how a 
massive international effort by 
more than 1000 scientists from 
16 countries finally observed 
the signature of a gravita- 
tional wave as it brushed over 
Earth. The LIGO collaboration 
is poised to expand glob- 
ally, Gonzalez said, with the 
VIRGO detector in Italy and 
the KAGRA detector in Japan 
now under construction. A few 
days after the historic LIGO an- 
nouncement, the cabinet of Indian Prime Minister Shri Narendra 
Modi gave approval to a new detection facility called LIGO-India. 

The 182nd AAAS meeting drew nearly 10,000 attendees from 
60 countries to participate in research presentations, career 
workshops, and special events such as Family Science Days. At 
that event, more than 3000 children and adults had a chance to 
engage with dozens of scientists and explore 30 interactive science 
exhibits. The meeting also included four “edit-a-thons” to encour- 
age attendees to edit and create scientific content for the website 
Wikipedia, as well as discussions at the forum website Reddit, 
where scientists attending the meeting fielded the public’s questions
on robots, neutrinos, and addiction.


With reporting by Andrea Korte, Earl Lane, Jean Mendoza, 
and Juan David Romero 
INTRODUCTION: In 1984, the simplest cells
capable of autonomous growth, the mycoplasmas,
were proposed as models for understanding
the basic principles of life. In 1995, we reported
the first complete cellular genome sequences
(Haemophilus influenzae, 1815 genes, and Mycoplasma
genitalium, 525 genes). Comparison of
these sequences revealed a conserved core of
about 250 essential genes, much smaller than
either genome. In 1999, we introduced the method
of global transposon mutagenesis and experimentally
demonstrated that M. genitalium
contains many genes that are nonessential for
growth in the laboratory, even though it has the
Four design-build-test cycles produced JCVI-syn3.0.
(A) The cycle for genome design, building by means
of synthesis and cloning in yeast, and testing for viability
by means of genome transplantation. After each
cycle, gene essentiality is reevaluated by global transposon
mutagenesis. (B) Comparison of JCVI-syn1.0
(outer blue circle) with JCVI-syn3.0 (inner red circle),
showing the division of each into eight segments. The
red bars inside the outer circle indicate regions that
are retained in JCVI-syn3.0. (C) A cluster of JCVI-syn3.0
cells, showing spherical structures of varying sizes
(scale bar, 200 nm).
smallest genome known for an autonomously
replicating cell found in nature. This implied 
that it should be possible to produce a minimal 
cell that is simpler than any natural one. Whole 
genomes can now be built from chemically syn- 
thesized oligonucleotides and brought to life by 
installation into a receptive cellular environ- 
ment. We have applied whole-genome design 
and synthesis to the problem of minimizing a 
cellular genome. 


RATIONALE: Since the first genome sequences, 
there has been much work in many bacterial 
models to identify nonessential genes and 




define core sets of conserved genetic func- 
tions, using the methods of comparative ge- 
nomics. Often, more than one gene product can 
perform a particular essential function. In such 
cases, neither gene will be essential, and neither 
will necessarily be conserved. Consequently, 
these approaches cannot, by themselves, iden- 
tify a set of genes that is sufficient to constitute 
a viable genome. We set out to define a minimal
cellular genome experimentally by designing and 
building one, then testing it for viability. Our 
goal is a cell so simple that we can determine the 
molecular and biological function of every gene. 


RESULTS: Whole-genome design and synthesis
were used to minimize the 1079-kilobase
pair (kbp) synthetic genome of M. mycoides
JCVI-syn1.0. An initial design, based on collective
knowledge of molecular biology in combination
with limited transposon mutagenesis data, failed
to produce a viable cell. Improved transposon
mutagenesis methods revealed a class of quasi-essential
genes that are needed for
robust growth, explaining the failure of our
initial design. Three more cycles of design, synthesis,
and testing, with retention of quasi-essential
genes, produced JCVI-syn3.0 (531 kbp,
473 genes). Its genome is smaller than that of
any autonomously replicating cell found in
nature. JCVI-syn3.0 has a doubling time of
~180 min, produces colonies that are morphologically
similar to those of JCVI-syn1.0, and
appears to be polymorphic when examined
microscopically.

Read the full article at http://dx.doi.org/10.1126/science.aad6253


CONCLUSION: The minimal cell concept 
appears simple at first glance but becomes 
more complex upon close inspection. In addi- 
tion to essential and nonessential genes, there 
are many quasi-essential genes, which are not 
absolutely critical for viability but are nevertheless 
required for robust growth. Consequently, during 
the process of genome minimization, there is a 
trade-off between genome size and growth rate. 
JCVI-syn3.0 is a working approximation of a 
minimal cellular genome, a compromise be- 
tween small genome size and a workable growth 
rate for an experimental organism. It retains 
almost all the genes that are involved in the syn- 
thesis and processing of macromolecules. Un- 
expectedly, it also contains 149 genes with 
unknown biological functions, suggesting the 
presence of undiscovered functions that are es- 
sential for life. JCVI-syn3.0 is a versatile plat- 
form for investigating the core functions of life 
and for exploring whole-genome design.
We used whole-genome design and complete chemical synthesis to minimize the
1079-kilobase pair synthetic genome of Mycoplasma mycoides JCVI-syn1.0. An initial 
design, based on collective knowledge of molecular biology combined with limited 
transposon mutagenesis data, failed to produce a viable cell. Improved transposon 
mutagenesis methods revealed a class of quasi-essential genes that are needed for robust 
growth, explaining the failure of our initial design. Three cycles of design, synthesis, and 
testing, with retention of quasi-essential genes, produced JCVI-syn3.0 (531 kilobase pairs, 
473 genes), which has a genome smaller than that of any autonomously replicating cell 
found in nature. JCVI-syn3.0 retains almost all genes involved in the synthesis and 
processing of macromolecules. Unexpectedly, it also contains 149 genes with unknown 
biological functions. JCVI-syn3.0 is a versatile platform for investigating the core functions
of life and for exploring whole-genome design.


Cells are the fundamental units of life. The
genome sequence of a cell may be thought
of as its operating system. It carries the
code that specifies all of the genetic functions
of the cell, which in turn determine
the cellular chemistry, structure, replication, and 
other characteristics. Each genome contains in- 
structions for universal functions that are com- 
mon to all forms of life, as well as instructions 
that are specific to the particular species. The 
genome is dependent on the functions of the cell 
cytoplasm for its expression. In turn, the proper- 
ties of the cytoplasm are determined by the in- 
structions encoded in the genome. The genome 
can be viewed as a piece of software; DNA se- 
quencing allows the software code to be read. In 
1984, Morowitz proposed the simplest cells ca- 
pable of autonomous growth, the mycoplasmas, 
as models for understanding the basic principles 
of life (1). A key early step in his proposal was the
sequencing of a mycoplasma genome, which 
we accomplished for Mycoplasma genitalium 
in 1995 (2). Even with the sequence in hand, 
deciphering the operating system of the cell
was a daunting task. 

We have long been interested in simplifying 
the genomic software of a bacterial cell by elim- 
inating genes that are nonessential for cell growth 
under ideal conditions in the laboratory. This 
facilitates the goal of achieving an understanding 
of the molecular and biological function of every 
gene that is essential for life. To survive in nature, 
most bacterial cells must be capable of adapting 
to numerous environments. Typical well-studied 
bacteria such as Bacillus subtilis and Escherichia 
coli carry 4000 to 5000 genes. They are highly 
adaptable, because many of their genes provide 
functions that are needed only under certain 
growth conditions. Some bacteria, however, grow 
in restricted environments and have undergone 
genome reduction over evolutionary time. They 
have lost genes that are unnecessary in a stable 
environment. The mycoplasmas, which typically 
grow in the nutrient-rich environment of animal 
hosts, have the smallest known genomes of any 
autonomously replicating cells. A comparison 
of the first two available genome sequences, 
Haemophilus influenzae [1815 genes (3)] and M. 
genitalium [the smallest known mycoplasma ge- 
nome; 525 genes (2)], revealed a common core of 
only 256 genes, much smaller than either ge- 
nome. This was proposed to be the minimal gene 
set for life (4). 

In 1999, to put this comparative study to an 
experimental test, we introduced the method 
of global transposon mutagenesis (5), which al- 
lowed us to catalog 150 nonessential genes in 
M. genitalium (6) and predict a set of 375 essential
genes. These results showed that it should
be possible to produce a minimal genome that 
is smaller than any found in nature, but that the 
minimal genome would be larger than the com- 
mon set of 256 genes. At that time, we proposed 
to create and test a cassette-based minimal arti- 
ficial genome (5). We have been working since 
then to produce the tools needed to accomplish 
this. We developed methods to chemically syn- 
thesize the M. genitalium genome (7). However, 
M. genitalium grows very slowly, so we turned 
to the faster-growing M. mycoides genome as 
our target for minimization. We developed the 
method of genome transplantation, which al- 
lowed us to introduce M. mycoides genomes, as 
isolated DNA molecules, into cells of a different 
species, M. capricolum (8, 9). In this process, the 
M. capricolum genome is lost, resulting in a cell 
containing only the transplanted genome. In 2010, 
we reported the complete chemical synthesis and 
installation of the genome of M. mycoides JCVI-syn1.0
[1,078,809 base pairs (bp) (10); hereafter
abbreviated syn1.0]. This genome was an almost
exact copy of the wild-type M. mycoides genome, 
with the addition of a few watermark and vector 
sequences. 

Genome reduction in bacteria such as E. coli 
and B. subtilis has previously been achieved by 
a series of sequential deletion events (11, 12). 
After each deletion, viability, growth rate, and 
other phenotypes can be determined. In con- 
trast to this approach, we set out to design a 
reduced genome, then build and test it. We 
initially designed a hypothetical minimal ge- 
nome (HMG) based on a combination of exist- 
ing transposon mutagenesis and deletion data 
(13) and cumulative knowledge of molecular 
biology from the literature (14). 

We designed the genome to be built in eight 
segments, each of which could be independently 
tested for viability in the context of a seven- 
eighths syn1.0 genome (i.e., a syn1.0 genome that
is seven-eighths complete). Initially, only one of 
the designed HMG segments produced a viable 
genome. Improvements to our global transposon 
mutagenesis method allowed us to reliably clas- 
sify genes as essential or nonessential and to 
identify quasi-essential genes that are needed for 
robust growth, though not absolutely required 
[figs. S1 to S4 (9); a similar result in M. pneumoniae 
is presented in (15)]. We also established rules for 
removing genes from our genome design with- 
out disturbing the expression of the remaining 
genes. Methods that we developed in the course 
of building syn1.0 (10) provide a way to build a 
new genome as a centromeric plasmid in yeast 
and to test it for viability and other phenotypic 
traits after transplantation into an M. capricolum 
recipient cell. These methods, along with improve- 
ments described here, make up a design-build- 
test (DBT) cycle (Fig. 1). 

Here we report a new cell, JCVI-syn3.0 (ab- 
breviated syn3.0), that is controlled by a 
531-kilobase pair (kbp) synthetic genome that 
encodes 438 proteins and 35 annotated RNAs. 
It is a working approximation to a minimal cell. 
Its genome is substantially smaller than that of
M. genitalium, and its doubling rate is about five
times as fast.

Fig. 1. The JCVI DBT cycle for bacterial genomes. At each cycle, the genome is built as a centromeric
plasmid in yeast, then tested by transplantation of the genome into an M. capricolum recipient. In this
study, our main design objective was genome minimization. Starting from syn1.0, we designed a reduced
genome by removing nonessential genes, as judged by global Tn5 gene disruption. Each of eight reduced
segments was tested in the context of a seven-eighths syn1.0 genome and in combination with other
reduced segments. At each cycle, gene essentiality was reevaluated by Tn5 mutagenesis of the smallest
viable assembly of reduced and syn1.0 segments that gave robust growth.


Preliminary knowledge-based HMG 
design does not yield a viable cell 


In our first attempt to make a minimized cell, 
we started with syn1.0 (10) and used informa- 
tion from the biochemical literature, as well as 
some transposon mutagenesis data, to produce 
a rational design. Genes that could be disrupted 
by transposon insertions without affecting cell 
viability were considered to be nonessential. Based 
on ~16,000 transposon 4001 (Tn4001) and Tn5
insertions into the syn1.0 genome, we were
able to find and delete a total of 440 apparently 
nonessential genes from the syn1.0 genome. The 
resulting HMG design was 483 kbp in size and 
contained 432 protein genes and 39 RNA genes 
(database S1 includes a detailed gene list). 

In the course of designing the HMG, we de- 
veloped a simple set of deletion rules that was 
used throughout the project. (i) Generally, the 
entire coding region of each nonessential gene 
was deleted, including start and stop codons. 
(Exceptions are described below.) (ii) When a 
cluster of more than one consecutive gene was 
deleted, the intergenic regions within the clus- 
ter were deleted also. (iii) Intergenic regions 
flanking a deleted gene or a consecutive cluster 
of deleted genes were retained. (iv) If part of a 
gene to be deleted overlapped a retained gene,
that part of the gene was retained. (v) If part of
a gene to be deleted contained a ribosome bind- 
ing site or promoter for a retained gene, that 
part of the gene was retained. (vi) When two 
genes were divergently transcribed, we assumed 
that the intergenic region separating them con- 
tained promoters for transcription in both direc- 
tions. (vii) When a deletion resulted in converging 
transcripts, a bidirectional terminator was in- 
serted, if one was not already present. 
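As an illustration only, rules (i) to (iv) can be sketched as a small interval computation. The `Gene` record, the 0-based half-open coordinates, and the function name are assumptions made for this sketch and are not the software used in the study. Rules (v) to (vii), which concern promoters, ribosome binding sites, and terminators, are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int    # 0-based, inclusive (hypothetical coordinate convention)
    end: int      # 0-based, exclusive
    delete: bool  # True if the gene is slated for removal


def deletion_intervals(genes):
    """Return the (start, end) spans to remove under rules (i)-(iv)."""
    spans = []
    current = None      # deletion cluster being extended
    last_kept_end = 0   # end of the most recent retained gene
    for gene in sorted(genes, key=lambda g: g.start):
        if gene.delete:
            # Rule (iv): do not cut into an upstream retained gene.
            start = max(gene.start, last_kept_end)
            if current is None:
                # Rule (i): remove the whole coding region.
                current = [start, gene.end]
            else:
                # Rule (ii): extending the span also removes the
                # intergenic region inside a cluster of deleted genes.
                current[1] = max(current[1], gene.end)
        else:
            if current is not None:
                # Rule (iv): do not cut into a downstream retained gene.
                current[1] = min(current[1], gene.start)
                if current[0] < current[1]:
                    spans.append((current[0], current[1]))
                current = None
            last_kept_end = max(last_kept_end, gene.end)
    if current is not None and current[0] < current[1]:
        spans.append((current[0], current[1]))
    # Rule (iii) holds implicitly: spans begin and end at gene boundaries,
    # so the intergenic regions flanking each cluster are left in place.
    return spans


example = [
    Gene("A", 0, 100, False),
    Gene("B", 150, 300, True),
    Gene("C", 320, 500, True),   # overlaps the start of retained gene D
    Gene("D", 490, 700, False),
    Gene("E", 800, 900, True),
]
print(deletion_intervals(example))
```

In this toy example, genes B and C are removed as one span together with the intergenic region between them, the span is trimmed where C overlaps the retained gene D, and the regions flanking each cluster stay intact.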

Because of the possibility of design flaws, we 
divided the genome into eight overlapping seg- 
ments that could be independently synthesized 
and tested. We previously used this approach 
to identify a single lethal point mutation in our 
synthetic syn1.0 construct (10). As before, each of 
the eight designed synthetic segments had a cor- 
responding syn1.0 DNA segment. This allowed 
untested pieces to be mixed and matched with 
viable syn1.0 pieces in one-pot combinatorial as- 
semblies or to be purposefully assembled in any 
specified combination (16, 17). Additionally, each
of the eight target segments was moved into a
seven-eighths syn1.0 background by recombinase-mediated
cassette exchange [RMCE; (18)] (fig. S10)
(9). Unique restriction sites (NotI sites) flanked
each HMG or syn1.0 segment in the resulting yeast
strains (fig. S9 and table S12) (9). Upon transplantation,
we obtained a mycoplasma strain carrying
any viable HMG segment (flanked by NotI sites)
and eight other strains, each carrying one syn1.0
segment (flanked by NotI sites). This facilitated
the production of one-eighth genome segments, 
because they could be recovered from bacterial 
cultures, which produce much higher yields of 
better-quality DNA than yeast (9). All eight HMG 
segments were tested in a syn1.0 background, but 
only one of the segment designs produced viable 
colonies (HMG segment 2), and the cells grew 
poorly. 

Perhaps the greatest value that we derived 
from the HMG work was the refinement of semi- 
automated DNA synthesis methods by means 
of error-correcting procedures. We had previ- 
ously developed a variety of DNA synthesis and 
assembly methodologies that extend from oligo- 
nucleotides to whole chromosomes. In this work, 
we optimized the methodology to rapidly gener- 
ate error-free large DNA constructs, starting from 
overlapping oligonucleotides in a semiautomated 
DNA synthesis pipeline. This was accomplished 
by developing robust protocols for (i) single- 
reaction assembly of 1.4-kbp DNA fragments from 
overlapping oligonucleotides, (ii) elimination of 
synthesis errors and facilitation of single-round 
assembly and cloning of error-free 7-kbp cassettes, 
(iii) cassette sequence verification to simultaneously 
identify hundreds of error-free clones in a single 
run, and (iv) rolling circle amplification (RCA) of 
large plasmid DNA derived from yeast. Together, 
these methods substantially increased the rate at 
which the DBT cycle could be carried out (9). 

Figure 2 illustrates the general approach that 
we used for whole-genome synthesis and assem- 
bly, using HMG as an example. An automated 
genome synthesis protocol was established to 
generate overlapping oligonucleotide sequences, 
starting from a DNA sequence design (9). Briefly, 
the software parameters include the number of 
assembly stages, overlap length, maximum oli- 
gonucleotide size, and appended sequences to 
facilitate polymerase chain reaction (PCR) am- 
plification or cloning and hierarchical DNA as- 
sembly. About 48 oligonucleotides were pooled, 
assembled, and amplified to generate 1.4-kbp DNA 
fragments in a single reaction (figs. S12 and S13) 
(9). The 1.4-kbp DNA fragments were then error 
corrected, re-amplified, assembled five at a time 
into a vector, and transformed into E. coli. Error- 
free 7-kbp cassettes were identified on a DNA 
sequencer (Illumina MiSeq), and as many as 15 
cassettes were assembled in yeast to generate 
one-eighth molecules. Supercoiled plasmid DNA 
was prepared from positively screened yeast clones, 
and RCA was performed to generate microgram 
quantities of DNA for whole-genome assembly 
in yeast (figs. S14 to S16). This whole-genome synthesis
workflow can be carried out in less than
three weeks, which is about two orders of mag- 
nitude faster than the first reported synthesis of 
a bacterial genome (by our group) in 2008 (7). 
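The first step of this workflow, splitting a designed sequence into overlapping oligonucleotides, can be sketched as follows. This is a simplified single-strand tiling written for illustration. The function name and the default values for maximum oligonucleotide size and overlap length are assumptions and are not the parameters of the software described in (9).

```python
def tile_oligos(sequence, max_oligo_len=60, overlap=30):
    """Split a sequence into overlapping oligos of at most max_oligo_len bases.

    Returns a list of (start, oligo) pairs. Consecutive oligos share
    `overlap` bases; the last oligo may be shorter than the others.
    """
    if not 0 < overlap < max_oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo")
    step = max_oligo_len - overlap
    oligos = []
    start = 0
    while True:
        end = min(start + max_oligo_len, len(sequence))
        oligos.append((start, sequence[start:end]))
        if end == len(sequence):
            break
        start += step
    return oligos


# A placeholder 1.4-kbp fragment (not a real design sequence).
fragment = "ACGT" * 350
oligos = tile_oligos(fragment)
print(len(oligos), "oligos for a", len(fragment), "bp fragment")
```

With these assumed settings the tiling yields a few dozen oligonucleotides per 1.4-kbp fragment, the same order as the pools of about 48 described above. Five such fragments then make up one 7-kbp cassette.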


Tn5 mutagenesis identifies essential, 
quasi-essential, and nonessential genes 


It was clear from the limited success of our HMG 
design that we needed a better understanding of 
which genes are essential versus nonessential. To 
achieve this, we used Tn5 mutagenesis (fig. S1).
An initial Tn5 disruption map was generated by
transforming JCVI-syn1.0 ΔRE ΔIS cells [in which
all restriction enzyme (RE) genes and six insertion
(IS) elements are deleted; table S6] with an activated
form of a 988-bp miniature-Tn5 puromycin resistance
transposon (fig. S1) (9). Transformed cells
were selected on agar plates containing 10 μg/ml
of puromycin. About 80,000 colonies, each arising
from a single Tn5 insertion event, were pooled
from the plates. A sample of DNA extracted from
this “P0” pool was mechanically sheared and analyzed
for the sites of Tn5 insertion using inverse
PCR and DNA sequencing. The P0 data set contained
~30,000 unique insertions. To remove
slow-growing mutagenized cells, a sample of the
pooled P0 cells was serially passaged for more
than 40 generations, and DNA was prepared and
sequenced to generate a “P4” data set containing
~14,000 insertions (fig. S2).

Genes fell into three major groups: (i) Genes 
that were not hit at all, or that were sparsely hit 
in the terminal 20% of the 3’-end or the first few 
bases of the 5’-end, were classified as essential 
(“e-genes”) (5). (ii) Genes that were hit frequently 
by both P0 and P4 insertions were classified as
nonessential (“n-genes”). (iii) Genes hit primarily
by P0 insertions but not P4 insertions were classified
as quasi-essential, the deletion of which
would cause growth impairments (“i-genes”). Cells 
with i-gene disruptions spanned a continuum 
of growth impairment, varying from minimal to 
severe. To differentiate this growth continuum, 
we designated i-genes with minimal growth dis- 
advantage as “in-genes” and those with severe 
growth defect as “ie-genes.” Of the 901 annotated 
protein and RNA coding genes in the syn1.0 
genome, 432 were initially classified as n-, 240 as 
e-, and 229 as i-genes (Fig. 3, A and B, and fig. S3).
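The three-way classification can be expressed as a short decision rule. The sketch below is illustrative only. The numeric thresholds (`head_bases`, `tail_fraction` applied as a hard cutoff, and `min_hits`) and the function names are assumptions, because the text describes the criteria qualitatively ("sparsely hit", "hit frequently") without giving cutoffs.

```python
def disruptive_hits(positions, gene_length, head_bases=9, tail_fraction=0.20):
    """Count insertions outside the tolerated 5' head and 3' tail of a gene.

    `positions` are 0-based offsets from the 5' end of the gene.
    """
    tail_start = gene_length * (1.0 - tail_fraction)
    return sum(1 for p in positions if head_bases <= p < tail_start)


def classify_gene(p0_positions, p4_positions, gene_length, min_hits=3):
    """Return 'n', 'i', or 'e' from P0 and P4 insertion positions."""
    p0 = disruptive_hits(p0_positions, gene_length)
    p4 = disruptive_hits(p4_positions, gene_length)
    if p0 >= min_hits and p4 >= min_hits:
        return "n"  # hit frequently in both pools: nonessential
    if p0 >= min_hits:
        return "i"  # hit in P0 but lost on passaging: quasi-essential
    return "e"      # not hit, or hit only near the ends: essential


print(classify_gene([100, 200, 300, 400], [150, 250, 350], 1000))  # n
print(classify_gene([100, 200, 300], [], 1000))                    # i
print(classify_gene([2, 950], [960], 1000))                        # e
```

The third call shows why positions matter: insertions in the first few bases or the terminal 20% of a gene do not count against essentiality.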

In viewing a syn1.0 map of P4 insertions (fig. 
S4), it was evident that nonessential genes tended 
to occur in clusters far more often than expected 
by chance. We used deletion analysis to confirm 
that most of the n-gene clusters could be deleted 
without losing viability or substantively affecting 
growth rate (9). Individual gene clusters (or, in 
some cases, single genes) were replaced by the 
yeast URA3 marker as follows. Fifty-base pair 
sequences flanking the gene(s) to be deleted 
were added to the ends of the URA3 marker by 
PCR, and the DNA was introduced into yeast 
cells carrying the syn1.0 genome. Yeast clones
were selected on plates not containing uracil, 
confirmed by PCR, and transplanted to deter- 
mine viability. Deletions fell into three classes: 
(i) those resulting in no successful transplants, 
indicating deletion of an essential gene; (ii) 
those resulting in transplants with normal or 
near-normal growth rates, indicating deletion 
of nonessential genes; and (iii) those resulting 
in transplants with slow growth, indicating 
deletion of quasi-essential genes. 
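The marker-replacement step can be illustrated with plain string operations. This is a minimal sketch that treats the genome as a linear string and uses a placeholder in place of the real URA3 sequence; the function names are hypothetical.

```python
def build_deletion_cassette(genome, del_start, del_end, marker, flank=50):
    """Return the marker flanked by homology arms taken from either side
    of the region to delete (0-based, half-open coordinates)."""
    if not (flank <= del_start < del_end <= len(genome) - flank):
        raise ValueError("deletion must leave room for both homology arms")
    left_arm = genome[del_start - flank:del_start]
    right_arm = genome[del_end:del_end + flank]
    return left_arm + marker + right_arm


def apply_replacement(genome, del_start, del_end, marker):
    """Sequence expected after the marker replaces the deleted region."""
    return genome[:del_start] + marker + genome[del_end:]


# Toy sequences only; "TTTT" stands in for the URA3 marker.
toy_genome = "A" * 60 + "C" * 40 + "G" * 60
cassette = build_deletion_cassette(toy_genome, 60, 100, "TTTT")
edited = apply_replacement(toy_genome, 60, 100, "TTTT")
print(len(cassette), len(edited))
```

The 50-bp arms on the cassette match the sequence on both sides of the target, which is what directs the marker to replace exactly the chosen gene or cluster in yeast.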

A large number of deletions, including all of 
the HMG deletions, were individually tested for 
viability and yielded valuable information for 
subsequent reduced-genome designs. The trans- 
poson insertion data that were available at the time 
of the HMG design were all collected from passage
P0. Consequently, genes with insertions included
the genes that were subsequently characterized as
quasi-essential i-genes, so some HMG deletions
produced very small colonies or were nonviable.

Fig. 2. Strategy for whole-genome synthesis. Overlapping oligonucleotides (oligos) were designed,
chemically synthesized, and assembled into 1.4-kbp fragments (red). After error correction and PCR
amplification, five fragments were assembled into 7-kbp cassettes (blue). Cassettes were sequence-
verified and then assembled in yeast to generate one-eighth molecules (green). The eight molecules
were amplified by RCA and then assembled in yeast to generate the complete genome (orange).

In parallel with iterations of the DBT cycle, 
described below, we also took the traditional 
sequential deletion approach to genome reduc- 
tion. We performed stepwise scarless deletions 
(fig. S8 and table S11) (9) of medium to large 
clusters to produce a series of strains with pro- 
gressively greater numbers of genes removed. 
Strain D22, with 255 genes and 357 kbp of DNA 
removed, grew at a rate similar to syn1.0 (table S6). 
We discontinued this approach when it became 
clear that the synthesis of redesigned segments 
at each DBT cycle would yield a minimal cell more 
quickly. These deletion studies also helped to 
validate our simple set of deletion rules. 


Retaining quasi-essential genes yields 
viable segments but no viable 
complete genome 


To improve on the design of the HMG, we redesigned
a reduced genome using the Tn5 and
deletion data described above. This reduced genome
design (RGD1.0) achieved a 50% reduction
of syn1.0 by removing ~90% of the n-genes
(table S1). In a few cases, n-genes were retained:
specifically, if their biochemical function appeared
essential, or if they were singlet n-genes separat- 
ing two large e- or i-gene clusters. To preserve the 
expression of genes upstream and downstream 
of deleted regions, we followed the same design 
rules that we used in the HMG design. 

The eight segments of RGD1.0 were chemi- 
cally synthesized as described above, and each 
synthetic reduced segment was inserted into a 
seven-eighths syn1.0 background in yeast by 
means of RMCE (fig. S10 and table S13) (9). Each 
one-eighth RGD + seven-eighths syn1.0 genome
was then transplanted out of yeast to test for 
viability. Each of the eight reduced segments 
produced a viable transplant; however, segment 
6 produced only a very small colony in the first 
6 days. On further growth over the next 6 days, 
sectors of faster-growing cells developed (fig. S18). 
Several isolates of the faster-growing cells were 
sequenced and found to have destabilizing mu- 
tations in a transcription terminator that had 
been joined to an essential gene when the non- 
essential gene preceding it was deleted (figs. S19 
and S21). Another mutation produced a consensus
TATAAT box in front of the essential gene
(fig. S20). This illustrates the potential for ex- 
pression errors when genes are deleted, but it shows 
that these errors can sometimes be corrected by
subsequent spontaneous mutation. Ultimately,
we identified a promoter that had been overlooked
and erroneously deleted. When this region
was resupplied in accordance with the design
rules, cells containing designed segment 6
grew rapidly. This solution was incorporated
in later designs.

Fig. 3. Classification of gene essentiality by transposon mutagenesis. (A) Examples of the three
gene classifications, based on Tn5 mutagenesis data. The region of syn1.0 from sequence coordinates
166,735 to 170,077 is shown. The gene MMSYN1_0128 (lime arrow) has many P0 Tn5 inserts (black
triangles) and is an i-gene (quasi-essential). The next gene, MMSYN1_0129 (light blue arrow), has no
inserts and is an e-gene (essential). The last gene, MMSYN1_0130 (gray arrow), has both P0 (black
triangles) and P4 (magenta triangles) inserts and is an n-gene (nonessential). Intergenic regions are
indicated by black lines. (B) The number of syn1.0 genes in each Tn5-mutagenesis classification group.
The n- and in-genes are candidates for deletion in reduced genome designs.

Fig. 4. The three DBT cycles involved in building syn3.0. This detailed map shows syn1.0 genes
that were deleted or added back in the various DBT cycles leading from syn1.0 to syn2.0 and finally to
syn3.0 (compare with fig. S7). The long brown arrows indicate the eight NotI assembly segments.
Blue arrows represent genes that were retained throughout the process. Genes that were deleted in
both syn2.0 and syn3.0 are shown in yellow. Green arrows (slightly offset) represent genes that were
added back. The original RGD1.0 design was not viable, but a combination of syn1.0 segments 1, 3, 4,
and 5 and designed segments 2, 6, 7, and 8 produced a viable cell, referred to as RGD2678. Addition
of the genes shown in green resulted in syn2.0, which has eight designed segments. Additional
deletions, shown in magenta, produced syn3.0 (531,560 bp, 473 genes). The directions of the arrows
correspond to the directions of transcription and translation.

Despite the growth of cells containing each de- 
signed segment, combining all eight reduced RGD1.0 
segments, including the self-corrected segment 
6, into a single genome did not produce a viable cell 
when transplanted into M. capricolum (9). We then 
mixed the eight RGD1.0 segments with the eight 
syn1.0 segments to perform combinatorial assembly
of genomes in yeast (9). A number of completely
assembled genomes were obtained in yeast that
contained various combinations of RGD1.0 segments 
and syn1.0 segments. When transplanted, several 
of these combinations gave rise to viable cells 
(table S7). One of these (RGD2678)—containing 
RGD1.0 segments 2, 6, 7, and 8 and syn1.0 segments
1, 3, 4, and 5, with an acceptable growth
rate (105-min doubling time, compared with 60 min 
for syn1.0)—was analyzed in more detail. 
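To give a sense of the design space sampled by this combinatorial assembly, the sketch below enumerates every way of filling the eight segment positions with either the reduced or the syn1.0 version. The naming helper extends the "RGD2678" convention to all combinations, which is an assumption made for illustration. The enumeration describes the possible outcomes only and does not model which ones assemble or transplant successfully.

```python
from itertools import product

SEGMENTS = range(1, 9)  # the eight genome segments


def all_combinations():
    """Yield every assignment of 'RGD' or 'syn1.0' to the eight segments."""
    for choice in product(("RGD", "syn1.0"), repeat=8):
        yield dict(zip(SEGMENTS, choice))


def name_of(combo):
    """Name a combination by its reduced segments, e.g. 'RGD2678'."""
    reduced = "".join(str(s) for s in SEGMENTS if combo[s] == "RGD")
    return "RGD" + reduced if reduced else "syn1.0"


names = [name_of(c) for c in all_combinations()]
print(len(names))            # 2**8 possible genomes
print("RGD2678" in names)    # the combination analyzed in detail
```

Two choices at each of eight positions give 256 possible genomes, ranging from the unmodified syn1.0 to the fully reduced design.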


To obtain a viable genome, avoid 
deleting pairs of redundant genes for 
essential functions 


In bacteria, it is common for certain essential 
(or quasi-essential) functions to be provided by
more than one gene. The genes may or may not
be paralogs. Suppose gene A and gene B each
supplies the essential function, E1. Either gene
can be deleted without loss of E1, so each gene
by itself in a single knockout study is classed as
nonessential. However, if both are deleted, the
cell will be dead because E1 is no longer provided.
Such a lethal combination of mutations
is called a “synthetic lethal pair” (19). Redundant
genes for essential functions are common in 
bacterial genomes, although less so in genomes 
that have undergone extensive evolutionary 
reduction, such as the mycoplasmas. Our biggest 
design challenge has been synthetic lethal pairs 
in which gene A has been deleted from one seg- 
ment and gene B from another segment. Each 
segment is viable in the context of a seven-eighths 
syn1.0 background, but when combined, the re- 
sulting cell is nonviable, or, in the case of a shared 
quasi-essential function, grows more slowly. We 
do not know how many redundant genes for es- 
sential functions are present in each of the eight 
segments, but when RGD1.0 segments 2, 6, 7, and
8 were combined, the cell was viable. 
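The logic of a synthetic lethal pair can be made concrete with a toy model. The gene and function names below are hypothetical, and viability is reduced to a single rule: every essential function must keep at least one provider.

```python
# Hypothetical map from essential function to the genes that can supply it.
FUNCTIONS = {
    "E1": {"geneA", "geneB"},  # redundant pair
    "E2": {"geneC"},           # single provider
}


def is_viable(deleted, functions=FUNCTIONS):
    """True if every essential function still has at least one provider."""
    deleted = set(deleted)
    return all(providers - deleted for providers in functions.values())


def single_knockout_class(gene, functions=FUNCTIONS):
    """Classification a single-gene knockout study would assign."""
    return "nonessential" if is_viable({gene}, functions) else "essential"


for gene in ("geneA", "geneB", "geneC"):
    print(gene, single_knockout_class(gene))
print("A and B deleted together, viable:", is_viable({"geneA", "geneB"}))
```

Each of gene A and gene B looks nonessential on its own, yet deleting both is lethal. This is the situation that arises when the two genes lie in different segments that are each viable in a seven-eighths syn1.0 background.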

We subjected RGD2678 to Tn5 mutagenesis 
and found that some n-genes in the syn1.0 seg- 
ments 1, 3, 4, and 5 had converted to i- or e-genes 
in the genetic context of RGD2678 (table S2). 
This was presumably because these genes en- 
coded essential or quasi-essential functions that 
were redundant with a gene that had been de- 
leted in RGD2678. 

In addition, we examined 39 gene clusters and 
single genes that had been deleted in the de- 
sign of RGD1.0 segments 1, 3, 4, and 5 (table S8). 
These were deleted one at a time in an RGD2678 
background (tables S8 and S14) and tested for 
viability by transplantation. In several cases, this 
resulted in slow-growth transplants or no trans- 
plants, suggesting that they included one or more 
genes that are functionally redundant with genes 
that had been deleted in segments 2, 6, 7, or 8. 

The combined Tn5 and deletion data identified
26 genes (tables S2 and S9) as candidates
for adding back to RGD1.0 segments 1, 3, 4, and 
5 to produce a new RGD2.0 design for these seg- 
ments (fig. S5 and tables S1 and S2). An assembly 
was carried out in yeast using the newly designed 
and synthesized RGD2.0 segments 1, 3, 4, and 5, 
together with RGD1.0 segments 2, 6, 7, and 8 
(tables S7 and S15). This assembly was not viable 
initially, but we found that substituting syn1.0 
segment 5 for RGD2.0 segment 5 resulted in a 
viable transplant. Working with this strain, we 
deleted a cluster of genes (MMSYN1_0454 to
MMSYN1_0474) from syn1.0 segment 5 and replaced
another cluster of genes (MMSYN1_0483
to MMSYN1_0492) with gene MMSYN1_0154 (figs.
S6 and S11 and table S10) (9). Gene MMSYN1_0154
was originally deleted from segment 2 in the
RGD1.0 design but was found to increase growth
rate when added back to RGD2678. The described
revision of syn1.0 segment 5 in the RGD2.0 genetic
context yielded a viable cell, which we refer
to as JCVI-syn2.0 (abbreviated syn2.0; Fig. 4). With 
syn2.0, we achieved for the first time a minimized 
cell with a genome smaller than that of the smallest
known natural bacterium, M. genitalium. Syn2.0
doubles in laboratory culture every 92 min. Its
total genome size is 576 kbp. It contains 478 protein-coding
genes and 38 RNA genes from M. mycoides,
with 12 kbp of vector sequences for selection of the
genome and for propagation in yeast and E. coli.

Fig. 5. Map of proteins in syn3.0 and homologs found in other organisms. Searches using BLASTP software were performed for all syn3.0 protein-coding
genes against a panel of 14 organisms ranging from non-Mycoides mycoplasmas to Archaea. A score of 1e-5 was used as the similarity cutoff. From left to right, five
classes (equivalog, 232 genes; probable, 58 genes; putative, 34 genes; generic, 84 genes; and unknown, 65 genes) proceed from nearly complete certainty about a
gene’s activity (equivalog) to no functional information (unknown). White space indicates no homologs to syn3.0 in that organism.


Removing 42 additional genes yields an 
approximately minimal cell, syn3.0 


We performed a new round of Tn5 mutagene- 
sis on syn2.0. In this new genetic background, 
transition of some i-genes to apparent n-genes 
was expected. At this point, the composition of 
the P4 serial passage population was depleted 
of original n-genes; the faster-growing i-gene 
knockouts predominated and were classified 
as n-genes by our rules. We classified 90 genes 
as apparently nonessential. These were sub- 
divided into three groups. The first group con- 
tained 26 genes that were frequently classed as i- or 
e-genes in previous rounds of mutagenesis. The 
second group contained 27 genes that were clas- 
sified as i- or borderline i-genes in some of the 
previous Tn5 studies. The third group contained 
37 genes that had previously been classified as 
nonessential in several iterations of Tn5 mutagenesis
involving various genome contexts. To
create the new RGD3.0 design, these 37 genes 
were selected for deletion from syn2.0, along with 
two vector sequences, bla and lacZ, and the ri- 
bosomal RNA (rRNA) operon in segment 6 (Fig. 4 
and table S3). 

The eight newly designed RGD3.0 segments 
were synthesized and propagated as yeast plas- 
mids. These plasmids were amplified in vitro 
by RCA (9). All eight segments were then re- 
assembled in yeast to obtain several versions of 
the RGD3.0 genome as yeast plasmids (9). These 
assembled RGD3.0 genomes were transplanted 
out of yeast into an M. capricolum recipient cell. 
Several were viable. One of these, RGD3.0 clone 
g-19 (table S4), was selected for detailed analysis 
and named JCVI-syn3.0. 

A final round of Tn5 mutagenesis was performed
on syn3.0 to determine which genes
continue to show Tn5 insertions after serial
passaging (P4). Nonessential vector genes and
intergenic sequences are the most frequent insertion
sites. As expected, cells with insertions
in genes that were originally classified as quasi-essential
make up almost the whole population
of P4 cells that have insertions in mycoplasma
genes. The genes in syn3.0 are predominantly
e- or i-genes, based on the original syn1.0 classification.
Of these, only the i-genes can tolerate
Tn5 insertions without producing lethality. The
most highly represented in-, i-, and ie-genes are
shown in table S5. It might be possible to remove
a few of these in a fourth DBT cycle, but there
would probably be further erosion of growth
rate. In addition, a dozen genes that were originally
classified as nonessential continue to retain
that classification (table S5 and database S1).

Table 1. Syn1.0 genes listed by functional category and whether they were kept or deleted in
syn3.0. Categories with asterisks are mostly kept in syn3.0, whereas those without are depleted in
syn3.0. Vector sequences, for selection of the genome and for propagation in other hosts, are not included
in these gene tallies.

Functional category                   Kept   Deleted
Glucose transport and glycolysis*       15         0
Total                                  473       428


In syn3.0, 149 genes cannot be assigned 
a specific biological function 


Syn3.0 has 438 protein- and 35 RNA-coding genes. 
We organized the 473 genes into five classes, based
on our level of confidence in their precise func- 
tions: equivalog, probable, putative, generic, and 
unknown (Fig. 5 and database S1). Many of these 
genes have been studied exhaustively, and their 
primary biological functions are known. 

We used the TIGRfam equivalog family of 
hidden Markov models (20) to annotate equiv- 
alog genes (~49% of the genes). The less certain 
classes were defined in a stepwise manner (Fig. 
5). The probable class included genes that scored 
well against unambiguous TIGRfam mathemat- 
ical models but that nevertheless scored below 
the trusted cutoff. These genes were consistently 
supported by other lines of evidence. Genomic 
context and threading alignment to crystal struc- 
tures both agreed with the assignment. The pu- 
tative class included genes that were similarly 
supported by multiple lines of evidence; at the 
same time, either their scores, genomic context, 
or alignment to structures with known activities 
were not convincing. The generic class included 
genes encoding clearly identifiable proteins (e.g., 
kinase) but lacking consistent clues as to their 
substrates or biological role. Unknown genes were 
those that could not be reliably categorized with 
regard to a putative activity. 
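The share of genes in each confidence class follows directly from the class counts given in the Fig. 5 caption. The short calculation below is included as a worked check; the variable and function names are arbitrary.

```python
# Gene counts per annotation class, as listed in the Fig. 5 caption.
CLASS_COUNTS = {
    "equivalog": 232,
    "probable": 58,
    "putative": 34,
    "generic": 84,
    "unknown": 65,
}


def class_shares(counts):
    """Percentage of all genes that falls in each class."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total_genes = sum(CLASS_COUNTS.values())
shares = class_shares(CLASS_COUNTS)
no_specific_function = CLASS_COUNTS["generic"] + CLASS_COUNTS["unknown"]

print(total_genes)                                   # all syn3.0 genes
print(round(shares["equivalog"]))                    # equivalog share, %
print(no_specific_function,
      round(100.0 * no_specific_function / total_genes, 1))
```

The five classes sum to the 473 genes of syn3.0. The equivalog class is about 49% of the total, and the generic and unknown classes together contain 149 genes, or about 31%.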

Thus, biological functions could not be as- 
signed for the ~31% of the genes that were 
placed in the generic and unknown classes. 
Nevertheless, potential homologs for a number 
of these were found in diverse organisms. Many 
of these genes probably encode universal pro- 
teins whose functions are yet to be character- 
ized. Each of the five sectors has homologs in 
species ranging from mycoplasma to humans. 
However, some of each annotation class is blank, 
indicating that no homologs for these genes were 
found among the organisms chosen for display 
in Fig. 5. Because mycoplasmas evolve rapidly, 
some of the white space in Fig. 5 corresponds to 
sequences that have diverged so as to align poorly
with representatives from other organisms.

In Table 1, we have assigned syn1.0 genes to 
30 functional categories and indicated how many 
were kept or deleted in syn3.0. Of the 428 deleted 
genes, the largest group is the functionally un- 
assigned genes; 134 out of 213 have been deleted. 
All of the 73 mobile element and DNA modifica- 
tion and restriction genes have been removed, as 
well as most genes encoding lipoproteins (72 out 
of 87). These three categories alone account for
65% of the deleted genes. In addition, because of
the rich growth medium that supplies almost all 
of the necessary small molecules, many genes in- 
volved in transport, catabolism, proteolysis, and 
other metabolic processes have become dispens- 
able. For example, because glucose is plentiful in 
the medium, most genes for transport and catab- 
olism of other carbon sources have been deleted 
(34 out of 36), whereas all 15 genes involved in glu- 
cose transport and glycolysis have been retained. 
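The 65% figure can be reproduced from the counts quoted in this passage. The snippet below is a worked check only; the dictionary keys paraphrase the category names used in the text.

```python
# Deleted-gene counts quoted in the text for the three largest categories.
DELETED = {
    "functionally unassigned": 134,
    "mobile elements and DNA modification/restriction": 73,
    "lipoproteins": 72,
}
TOTAL_DELETED = 428

three_category_total = sum(DELETED.values())
share = 100.0 * three_category_total / TOTAL_DELETED
print(three_category_total, round(share, 1))

# Genes kept in two of these categories, from the "out of" totals.
unassigned_kept = 213 - DELETED["functionally unassigned"]
lipoproteins_kept = 87 - DELETED["lipoproteins"]
print(unassigned_kept, lipoproteins_kept)
```

The three categories cover 279 of the 428 deleted genes, which rounds to 65%. The 79 unassigned genes that remain match the number of syn3.0 genes without an assigned functional category.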

In contrast, almost all of the genes involved in 
the machinery for reading and expressing the 
genetic information in the genome and in ensuring 
the preservation of genetic information across 
generations have been retained. The first of these 
two fundamental life processes, the expression 
of genetic information as proteins, requires the 
retention of 195 genes in the categories of tran- 
scription, regulation, RNA metabolism, transla- 
tion, protein folding, RNA (rRNA, tRNA, and small 
RNAs), ribosome biogenesis, rRNA modification, 
and tRNA modification. The second of these two 
fundamental processes, the preservation of genome 
sequence information, requires the retention of 
34 genes in the categories of DNA replication, 
DNA repair, DNA topology, DNA metabolism, chro- 
mosome segregation, and cell division. These two 
processes together require 229 (48%) of the 
473 total genes in syn3.0 (Fig. 6). 

In addition to the two vital processes just 
described, another major component of living 
cells is the cell membrane that separates the outer 
medium from the cytoplasm and governs molec- 
ular traffic into and out of the cell. It is an iso- 
latable structure, and many of the syn3.0 genes 
code for its protein constituents. Because our 
minimal cell is largely lacking in biosynthesis of 
amino acids, lipids, nucleotides, and vitamins, it 
depends on the rich medium to supply almost all 
of these required small molecules. This neces- 
sitates numerous transport systems within the 
membrane. In addition, the membrane is rich 
in lipoproteins. Membrane-related genes account 
for 84 (18%) of the 473 total syn3.0 genes. In- 
cluded categories from Table 1 are lipoproteins, 
cofactor transport, efflux systems, protein trans- 
port, and other membrane transport systems. 


Fig. 6. Partition of genes into four major functional groups. Syn3.0 has 473 genes. Of these, 79 have no
assigned functional category (Table 1). The remainder can be assigned to four major functional groups:
(i) expression of genome information (195 genes); (ii) preservation of genome information (34 genes);
(iii) cell membrane structure and function (84 genes); and (iv) cytosolic metabolism (81 genes). The
percentage of genes in each group is indicated.
[Chart labels recovered: Expression of genome information, 41%; Cytosolic metabolism, 17%; Unassigned, 17%.]


Lastly, 81 genes (17%) that are primarily involved
in cytosolic metabolism are retained in the cat-
egories of nucleotide salvage, lipid salvage and
biogenesis, proteolysis, metabolic processes, redox
homeostasis, transport and catabolism of non-
glucose carbon sources, and glucose transport
and glycolysis (Fig. 6).

We presume that most of the 79 genes that
are not assigned to a functional category be-
long to one or another of these same four major
groups (gene expression, genome preservation,
membrane structure and function, and cytosolic
metabolism). Among these 79 genes, 65 have com-
pletely unknown functions and 24 are classified as
generic, such as in the case of a hydrolase for
which neither the substrate nor the biological role
is discernible. The other 60 of the 84 genes in the
generic class were assigned to a functional cat-
egory on the basis of their generic classification.
For example, an ABC transporter is assigned to
membrane transport, even though the substrate
is unknown. Some of the unassigned essential
genes match domains of unknown function that
have been found in a wide variety of organisms.


Syn3.0 has a doubling time of 3 hours
and is polymorphic in appearance


Comparison of syn3.0 with the starting cell,
syn1.0 (Fig. 7A) (9), showed that they have simi-
lar colony morphologies, which are characteristic
of the natural, wall-less Mycoplasma mycoides
subspecies capri on which the synthetic syn1.0
genome was originally based (10). The smaller
colony size of syn3.0 suggested a slower growth
rate and perhaps an altered colony architecture
on the solid medium. A corresponding reduction
in the growth rate of syn3.0 in static liquid cul-
ture (Fig. 7B), from a doubling time of ~60 min
for syn1.0 to ~180 min, confirmed the lower in-
trinsic rate of propagation for syn3.0. This rate,
however, greatly exceeds the 16-hour doubling
time of M. genitalium (21).

In contrast to the anticipated reduction in
growth rate, we found unexpected changes in
macro- and microscopic growth properties of
syn3.0 cells. Whereas syn1.0 grew in static cul-
ture as nonadherent planktonic suspensions of
predominantly single cells with a diameter of
~400 nm (10), syn3.0 cells formed matted sedi- 
ments under the same conditions. Microscopic 
images of these undisturbed cells revealed ex- 
tensive networks of long, segmented filamen- 
tous structures, along with large vesicular bodies 
(Fig. 7C), which were particularly prevalent at late 
stages of growth. Both of these structures were 
easily disrupted by physical agitation, yet such sus- 
pensions contained small replicative forms that 
passed through 0.2-um filters to render colony- 
forming units (CFU). This same procedure retained 
99.9% of the CFU in planktonic syn1.0 cultures. 


Exploring the design of 
reorganized genomes 

To further refine our genome-design rules, we 
also investigated prospects for logically 
organizing genomes and recoding them at the 
nucleotide level. This effort was meant to clarify 
whether gene order and gene sequence are 
major contributors to cell viability. The ability 
to invert large sections of DNA in many ge- 
nomes has demonstrated that overall, gene 
order is not critical. We showed that fine-scale 
gene order is also not a major factor in cell 
survival. About an eighth of the genome was 
reconfigured into seven contiguous DNA cas- 
settes, six of which represented known bio- 
logical systems; the seventh cassette contained 
genes whose system-level assignment was some- 
what equivocal. The vertical bar on the right 
side of Fig. 8 specifies the biological systems. 
Individual genes (colored horizontal lines) and 
intergenic regions (black lines) can be traced 
from their native location to their new positions 
by following a line from left to right. Intersecting 
lines represent a change in the relative position 
of two genetic elements. Despite extensive 
reorganization, the resulting cell grew about as fast 
as syn1.0, as judged by colony size. Thus, the 
details of genetic organization impinge upon 
survival in hypercompetitive natural environ- 
ments, but the finer details are apparently not 
critical for life. 


Recoding and rRNA gene replacement 
provide examples of genome plasticity 


Our DBT cycle for bacterial genomes allows us 
to assess the plasticity of gene content in terms 
of sequence and functionality. This includes 
testing modified versions of genes that are 
fundamentally essential for life. We tested whether 
an altered 16S rRNA gene (rrs), which is essential, 
could support life (Fig. 9A). The single copy 
of the syn3.0 rrs gene was designed and synthesized 
to include seven single-nucleotide changes 
corresponding to those contained in the rrs gene 
of M. capricolum. In addition, we replaced helix 
h39 (35 nucleotides) with that from a 
phylogenetically distant E. coli rrs counterpart. This 
unique 16S gene was successfully incorporated 
into syn3.0 without noticeably affecting growth 
rate. Some other variants of the rrs gene were 
constructed but proved nonviable. This 
demonstrates our ability to test the plasticity of a gene 
sequence and, at the same time, provides a 
watermark by which to quickly identify this strain. 

Fig. 7. Comparison of syn1.0 and syn3.0 growth features. (A) Cells derived 
from 0.2 µm-filtered liquid cultures were diluted and plated on agar medium 
to compare colony size and morphology after 96 hours (scale bars, 1.0 mm). 
(B) Growth rates in liquid static culture were determined using a fluorescent 
measure (relative fluorescent units, RFU) of double-stranded DNA 
accumulation over time (minutes) to calculate doubling times (td). Coefficients 
of determination (R²) are shown. (C) Native cell morphology in liquid culture 
was imaged in wet mount preparations by means of differential interference 
contrast microscopy (scale bars, 10 µm). Arrowheads indicate assorted forms 
of segmented filaments (white) or large vesicles (black). (D) Scanning electron 
microscopy of syn1.0 and syn3.0 (scale bars, 1 µm). The picture on the right 
shows a variety of the structures observed in syn3.0 cultures. 

We also tested the underlying codon usage 
principles in the M. mycoides genome, which 
has extremely high adenine and thymine (AT) 
content. M. mycoides uses TGA as a codon for 
the amino acid tryptophan, instead of a stop 
codon, and occasionally uses nonstandard start 
codons; in addition, the codon usage is heavily 
biased toward high-AT content. We modified 
this uncommon codon usage in a 5-kbp region 
containing three essential genes (era, recO, and 
glyS) to determine its role. Specifically, we modified 
this region to include (i) M. mycoides codon 
adaptation index (CAI), but with the unusual 
start codons recoded and tryptophan encoded 
by the TGG codon, instead of by TGA; (ii) E. coli 
CAI, with tryptophan still encoded by TGA; or 
(iii) E. coli CAI with standard codon usage (TGG 
encoding tryptophan) (Fig. 9B). Unexpectedly, 
we found that all three versions were functional 
and resulted in M. mycoides cells without no- 
ticeable growth differences. However, large-scale 
changes in codon usage may need to accompany 
modifications in the tRNA dosage levels to en- 
sure efficient translation. 
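
One of the recoding steps described above, replacing TGA tryptophan codons with TGG, can be sketched in a few lines. This is only an illustration: the function names and the toy sequence are hypothetical, and the full redesign described in the text (start-codon recoding and codon adaptation to E. coli usage) is not attempted here. Because TGA encodes tryptophan in mycoplasmas, every in-frame TGA is treated as a sense codon.

```python
def recode_tga_to_tgg(cds):
    """Replace every in-frame TGA codon with TGG in a coding sequence.

    Out-of-frame occurrences of the letters "TGA" are left untouched
    because the sequence is processed codon by codon.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join("TGG" if codon == "TGA" else codon for codon in codons)


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy open reading frame (hypothetical): Met-Trp-Lys-Trp-stop.
original = "ATGTGAAAATGATAA"
recoded = recode_tga_to_tgg(original)
print(recoded, f"GC {gc_content(original):.2f} -> {gc_content(recoded):.2f}")
```

Each TGA to TGG substitution swaps an A for a G, so this change alone raises GC content only slightly, which is consistent with the small difference between the two M. mycoides CAI rows in Fig. 9B.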


Discussion and conclusions 


Genomics is moving from a descriptive phase, 
in which genomes are sequenced and analyzed, 
to a synthetic phase, in which whole genomes 
can be built by chemical synthesis. As the 
detailed genetic requirements for life are discovered, 
it will become possible to design whole 
genomes from first principles, build them by 
chemical synthesis, and then bring them to life 
by installation into a receptive cellular 
environment. We have applied this whole-genome 
design and synthesis approach to the problem of 
minimizing a cellular genome. 

Fig. 8. Reorganization of gene order in segment 2. Genes involved in the 
same process were grouped together in the design for “modularized” 
segment 2. At the far left, the gene order of syn1.0 segment 2 is indicated. 
Genes deleted in syn3.0 are indicated by faint gray lines. Retained genes are 
indicated by colored lines matching the functional categories to which 
they belong, which are shown on the right. Each line connects the 
position of the corresponding gene in syn1.0 with its position in the 
modularized segment 2. Black lines represent intergenic sequences 
containing promoters or transcriptional terminators. 

A minimal cell is usually defined as a cell in 
which all genes are essential. This definition is 
incomplete, because the genetic requirements 
for survival, and therefore the minimal genome 
size, depend on the environment in which the 
cell is grown. The work described here has been 
conducted in medium that supplies virtually all 
the small molecules required for life. A minimal 
genome determined under such permissive con- 
ditions should reveal a core set of environment- 
independent functions that are necessary and 
sufficient for life. Under less permissive condi- 
tions, we expect that additional genes will be 
required. 

There is a large body of literature concerning 
the minimal cell concept and minimal sets of 
essential genes in a number of organisms [for a 
review, see (22)]. Work in the area has focused 
on comparative genomic analyses and on expe- 
riments in which genes are individually knocked 
out or disrupted by transposon insertion. Such 
studies identify a core of essential genes, often 
about 250 in number. But this is not a set of 
genes that is sufficient to constitute a viable 
cellular genome, because redundant genes for 
essential functions are scored as nonessential 
in these studies. 

In contrast, we set out to construct a mini- 
mal cellular genome in order to experimentally 
determine a core set of genes for an indepen- 
dently replicating cell. We designed a genome 
using genes from M. mycoides JCVI-syn1.0 (10). 
This mycoplasma cell has several advantages 
for this purpose. First, the mycoplasmas already 
have very small genomes. They have evolved 
from gram-positive bacteria with larger genomes 
by losing genes that are unnecessary in their 
niche as mammalian parasites. They are already 
far along an evolutionary pathway to a minimal 
genome, and consequently they are likely to have 
fewer functionally redundant genes than other 
bacteria. We also have a highly developed set of 
tools for building this genome and for assembling 
and manipulating the genome as an extra chro- 
mosome in yeast. 

Our initial attempt to design a minimal genome 
was based on the current collective knowledge 
of molecular biology, in combination with lim- 
ited data concerning transposon disruption of 
genes, which provided additional information 
about gene essentiality. This information was 
particularly valuable with respect to the genes 
of unknown function. Specific experimental pro- 
posals for minimal genome construction have 
been made solely on the basis of accumulated 
knowledge concerning the genes that are in- 
volved in fundamental biological processes (14). 
Our HMG was assembled from eight segments 
and proved nonviable, although one of the seg- 
ments (segment 2) was functional when tested in 
the context of the other seven syn1.0 segments. 
These results convinced us that initially, we did 
not have sufficient knowledge to design a func- 
tional minimal genome from first principles. 
Therefore, to obtain better information concerning 
gene essentiality, we made major improvements 
in our transposon mutagenesis methods. 

To produce a genome containing all of the 
essential and quasi-essential genes, we developed 
a DBT cycle for bacterial genomes (Fig. 1). 
Any design, viable or not, can be built in yeast 
and then tested to determine whether it can 
function as the genome of a viable bacterium. 
After four DBT cycles (genome designs HMG, 
RGD1.0, RGD2.0, and RGD3.0), we obtained a 
viable genome with all eight segments reduced, 
syn3.0. Table 2 summarizes the process leading 
to syn3.0. The first three designs did not yield 
complete viable cellular genomes. But in each 
case, one or more of the eight segments yielded a 
viable genome when combined with syn1.0 
segments for the remainder of the genome. The 
composition of several of these intermediate 
strains is listed in the table. The viable syn3.0 
cell is our best approximation to a minimal 
cell. We obtained another strain with a genome 
smaller than any free-living cell found in nature 
(syn2.0). This cell has a genome consisting of 
seven RGD2.0 segments plus a syn1.0 segment 5 
with 31 genes deleted. 

Fig. 9. Gene content and codon usage principles, tested using the 
DBT cycle. (A) Secondary structure of the modified rrs gene that was 
successfully incorporated into the syn3.0 genome; this gene was 
carrying M. capricolum mutations and had its h39 (inset) swapped with that 
of E. coli. Positions with nucleotide changes are indicated by red arrows, 
and E. coli numbering is used to indicate the position of M. capricolum 
mutations. (B) The sequences of the essential genes era, recO, and glyS 
were modified in three different ways: using M. mycoides CAI with TGG 
encoding tryptophan, E. coli CAI with TGG encoding tryptophan, or E. coli 
CAI with TGA encoding tryptophan. GC content of the wild-type and 
modified genes is noted. The JCat codon adaptation tool was used for this 
exercise (www.jcat.de) to optimize the three open reading frames, with the 
exception of the overlapping gene fragment. Green and purple indicate 
wild-type and codon-optimized sequences, respectively. 

GC content noted in panel B (three values per design): 
M. mycoides CAI (wild type): 22.5%, 17.2%, 25.6% 
M. mycoides CAI + TGG_Trp: 22.7%, 17.2%, 26.3% 
E. coli CAI + TGA_Trp: 43.5%, 42%, 46.2% 
E. coli CAI + TGG_Trp: 43.7%, 42%, 46.9% 

Table 2. Reduced genomes resulting from the DBT cycles, ultimately leading to syn3.0. Column 1 indicates the round of genome design (dashes indicate the 
starting genome, syn1.0), column 2 gives the size of the designed genome (in kilobase pairs), and column 3 gives the number of mycoplasma genes in the design. 
Column 4 shows the genome composition for key viable cell strains; for nonviable designs, a viable strain with the highest number of segments from the design is 
shown, as well as a more robust alternative for RGD1.0 (fourth row) and a smaller derivative for RGD2.0 (sixth row, syn2.0). Column 5 gives the size of the genome 
corresponding to column 3, and column 6 shows a quantitative or qualitative estimate of the growth rate of cells with the genome described in column 4. 

1. Genome design | 2. Design size | 3. Number of design genes | 4. Cellular genome segment composition for key viable strains | 5. Cellular genome size | 6. Growth rate 
- | - | - | syn1.0: all eight syn1.0 segments | 1079 kbp | Doubling time, td = 60 min 
HMG | 483 kbp | [illegible] | HMG segment 2 + 7/8 syn1.0 | 1003 kbp | Slow-growing 
[illegible] | [illegible] | [illegible] | RGD1.0 segments 1,2,4,6,8 + syn1.0 segments 3,5,7 | 718 kbp | Slow-growing 
[illegible] | [illegible] | [illegible] | syn2.0: RGD2.0 segments 1,2,3,4,6,7,8 + syn1.0 segment 5 with genes MMSYN1_0454 to -0474 and MMSYN1_0483 to -0492 deleted | [illegible] | [illegible] 
RGD3.0 | 531 kbp | [illegible] | syn3.0: all eight segments of RGD3.0 | 531 kbp | td = 180 min 

Syn3.0 has a 531-kbp genome that encodes 473 
gene products. It is substantially smaller than 
M. genitalium (580 kbp), which has the smallest 
genome of any naturally occurring cell that has 
been grown in pure culture. The syn3.0 genome 
contains the core set of genes that are required 
for cellular life, but it is only half the size of syn1.0 
(10). In comparing the HMG to the subsequently 
derived viable syn3.0 genome, we found agree- 
ment among 329 deleted genes and 365 retained 
genes. However, 111 genes that were kept in 
syn3.0 were deleted from HMG, and 100 genes 
deleted in syn3.0 were kept in HMG. The dis- 
crepancies were primarily due to the sparseness 
and quality of the initial transposon data, which 
resulted in incorrect identification of a number 
of essential or nonessential genes and did not 
identify quasi-essential genes that affect growth 
rate (discussed below). One example of the 
importance of the quasi-essential gene classification 
was the case of four genes (MMSYN1_0008 to 
MMSYN1_0011) that make up the transport system 
for nucleosides. The original annotation of 
the system was as a ribose/galactose ABC trans- 
porter, which led us to target it for deletion in the 
HMG. Our initial transposon data showed that 
all four genes were hit heavily and appeared to 
confirm that the genes were nonessential. How- 
ever, in later transposon mapping experiments, 
P0 transposon data confirmed that they were hit 
heavily, but after serial passage to deplete slow- 
growing cells, all four genes received zero hits, 
confirming that they were quasi-essential and 
should have been retained. 
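
The reasoning in this example, where genes hit heavily at first but not after serial passage are quasi-essential, can be expressed as a simple decision rule. The sketch below is a deliberately simplified assumption: the threshold, the function name, and the hit counts are hypothetical, and the intermediate classes mentioned elsewhere in the text (ie- and in-genes) are not modeled.

```python
def classify_gene(p0_hits, passaged_hits, min_hits=1):
    """Classify a gene from transposon insertion counts.

    p0_hits:       insertions seen before serial passage
    passaged_hits: insertions remaining after serial passage
    min_hits:      hypothetical threshold for counting a gene as "hit"
    """
    if p0_hits < min_hits:
        return "e"  # no tolerated insertions: essential
    if passaged_hits < min_hits:
        return "i"  # insertions lost on passage: quasi-essential
    return "n"      # insertions persist: nonessential


# Hypothetical counts: (hits before passage, hits after passage).
observed = {"gene_A": (25, 0), "gene_B": (0, 0), "gene_C": (30, 28)}
classes = {name: classify_gene(*hits) for name, hits in observed.items()}
print(classes)
```

Under this rule the four nucleoside transporter genes described above would move from "n" (initial data alone) to "i" once the post-passage counts are taken into account.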


Gene content of syn3.0 


Syn3.0 is a working approximation of a minimal 
cell. Our first synthetic cell, syn1.0, contained 
901 mycoplasma genes plus some watermarks and 
vector sequences. Of these, 428 have been re- 
moved in syn3.0, leaving 438 protein-coding genes 
and 35 RNA genes. More genes could probably 
be removed while retaining viability, but it seems 
likely that growth rate would be compromised. 
The slower growth rate of syn3.0 is not due to the 
removal of one of the rRNA operons. We also con- 
structed a strain with the same gene complement, 
except that it retained both rRNA operons, and 
this strain grew at close to the same rate as syn3.0. 

The largest group of genes retained in syn3.0 
is involved in gene expression (195 genes, 41%). 
Approximately equal numbers of genes are 
involved in the cell membrane (84 genes, 18%) and 
in metabolism (81 genes, 17%). During reductive 
evolution as a mycoplasma, many biosynthetic 
genes were lost and replaced by transporters re- 
siding in the membrane, resulting in a trade-off 
between these two categories. A relatively small 
number of genes function in the replication of 
the genome and the preservation of genomic in- 
formation through cell division (36 genes, 7%). 
Unexpectedly, there are 79 genes (17%) that we 
have been unable to assign to a functional cat- 
egory. Of these, 19 are in the essential category 
(e-genes), 36 are needed for rapid growth (i- or 
ie-genes), and 24 are nonessential or nearly so (n- 
or in-genes). We presume that most of these will 
fall into one of the four major categories described 
above (gene expression, membrane structure and 
function, cytosolic metabolism, and genome pres- 
ervation), but it seems likely that some of them 
may perform previously undescribed biological 
functions. In particular, 13 of the 19 functionally 
unassigned e-genes are of completely unknown 
function. Some of these match genes of unknown 
function in other bacteria or even in eukaryotes, 
and these are prime candidates for proteins with 
novel functions. Genes of unknown function that 
are required in syn3.0 and present in most orga- 
nisms must represent nearly universal functions 
and thus can provide biological insights. Like- 
wise, unknown genes without homologs may be 
novel, or they may represent unusual sequences 
but well-understood functions. In contrast to the 
wholly unknown genes, it is easy to oversimplify 
a gene’s putative role in cell survival if it has a 
generic functional assignment. For example, 
some of the numerous hydrolases and kinases 
will undoubtedly contribute to processes such 
as nucleotide or cofactor salvage. The question is, 
will all of the generic functions of the unknown 
genes be so commonplace, or do some of them 
represent fundamentally new processes? There 
are genes whose generic annotations are per- 
plexing even though they are needed for survival. 
For example, there are six different efflux systems, 
encoded by the genes MMSYN1_0034, MMSYN1_0371 
and MMSYN1_0372, MMSYN1_0399, MMSYN1_0531, 
MMSYN1_0639, and MMSYN1_0691. Except for the 
heterodimeric pair, MMSYN1_0371 and MMSYN1_0372, 
which may be a flippase, the substrates and func- 
tions of these proteins are unclear. It is some- 
what disconcerting to imagine that all of these 
exclude or remove toxic substances. Similarly, a 
rather complex pathway (23) for producing and 
exporting glycoglycerolipids was required. Although 
there is some evidence that galactofuranose res- 
idues are important for membrane integrity (24), 
a detailed explanation for the biological role 
fulfilled by glycoglycerolipids remains obscure. 
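
The gene counts quoted in this section can be cross-checked with a short tally. The sketch below uses the per-group counts from the figure caption that partitions the genes into four major functional groups, which lists 34 genes for preservation of genome information, whereas the running text above gives 36 for the same group. With the caption's figure the groups sum to the stated 473 genes.

```python
# Gene counts per functional group, as listed in the figure caption.
counts = {
    "expression of genome information": 195,
    "preservation of genome information": 34,
    "cell membrane structure and function": 84,
    "cytosolic metabolism": 81,
    "unassigned": 79,
}

total = sum(counts.values())
shares = {group: round(100 * n / total) for group, n in counts.items()}

# syn1.0 genes minus removed genes should equal protein-coding plus RNA genes.
retained = 901 - 428
assert retained == 438 + 35 == total

for group, n in counts.items():
    print(f"{group}: {n} genes ({shares[group]}%)")
```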


Phenotype of the syn3.0 cell 


The replication of genomic information and its co- 
ordinated distribution into segregated membrane- 
bound cellular compartments are hallmarks of 
extant living systems that are commonly con- 
sidered to be among the attributes that define 
cellular life (25). The minimal requirements for 
this process are not known, but evidence from 
disparate fields of study suggests that mechanisms 
far simpler than the complex division apparatus 
in most eubacteria may be sufficient. 
First, several types of bacterial cells, both 
natural (26) and experimentally manipulated 
(27, 28), have been shown to divide in the absence 
of key cytoskeletal components, most notably the 
FtsZ cytoskeletal scaffolding and force-generating 
component. Through our empirically based design 
process, a nonessential gene cluster present in 
syn1.0 (MMSYN1_0520 through MMSYN1_0522) 
was removed during construction of syn3.0 cells. 
This contained orthologs of ftsZ and sepF [encoding 
a membrane-anchoring component that interacts 
with FtsZ (29)]. An adjacent gene, ftsA, which 
is reported to share some redundant functions with 
sepF in other systems (30), remained essential in 
progressively reduced constructs that lacked ftsZ. 
Second, completely synthetic lipid vesicles 
have been shown to spontaneously segregate 
without the involvement of macromolecular 
scaffolding or catalysis (31). In propagating cell 
wall-deficient bacteria, alteration of the lipid 
content and properties of the plasma membrane 
have been shown to elicit analogous membrane 
vesicle segregation (32). In several wall-less my- 
coplasma species, filamentous and large-vesicle 
morphotypes similar to those in syn3.0 have long 
been observed under certain growth conditions, 
depending in part on the nature of lipid precur- 
sors available to these cells (33). Ultimately, 
understanding the genetic and mechanistic 
basis for the phenotype of syn3.0 propagation 
may shed light on the minimal requirements for 
segregation of the membrane-bounded cellular 
compartment that is essential for a living cell. 


Use of the DBT cycle for applications 
other than genome minimization 


Our main focus here has been the application 
of the whole-genome DBT cycle to a specific 
problem, the construction of a minimal cellular 
genome. However, the approach we describe can 
be applied to the construction of a cell with any 
desired properties. For example, a cell could be 
designed with added metabolic pathways (34), 
an altered genetic code (35), or dramatically al- 
tered gene arrangements. We have begun to de- 
sign genomes with modified 16S rRNA sequences 
and to assess the effects of dramatic alterations 
in codon usage. Application of our DBT cycle is 
limited only by our ability to produce designs with 
a reasonable chance of success. With increasing 
knowledge of the functions of essential genes that 
are presently unknown, and with increasing ex- 
perience in reorganizing the genome, we expect 
that our design capabilities will strengthen. The 
ability to design cells in which the function of 
every gene is known should facilitate complete 
computational modeling of the cell (36). This would 
make it possible to calculate the consequences of 
adding pathways for the production of useful prod- 
ucts, such as drugs or industrial chemicals, and 
would lead to greater efficiency in development. 


Methods summary 


Our methods for the identification of nonessential 
genes by global Tn5 mutagenesis, manipulation 
of bacterial genomes in yeast by the scarless 
TREC (tandem repeat coupled with endonuclease 
cleavage) deletion method, synthesis and assembly 
of reduced genomes, genome transplantation, mi- 
croscopic analysis of cells with reduced genomes, 
and observation of their growth characteristics are 
described in detail in the supplementary materials. 
General information about our methods, accom- 
panied by specific references to the supplementary 
materials, is included throughout the text. 

INTRODUCTION: The reproducibility of results 
is one of the underlying principles of science. An 
observation can only be accepted by the scientific 
community when it can be confirmed by inde- 
pendent studies. However, reproducibility does 
not come easily. Recent works have painfully 
exposed cases where previous conclusions were 
not upheld. The scrutiny of the scientific com- 
munity has also turned to research involving 
computer programs, finding that reproducibil- 
ity depends more strongly on implementation 
than commonly thought. These problems are 
especially relevant for property predictions of 
crystals and molecules, which hinge on precise 
computer implementations of the governing 
equation of quantum physics. 


RATIONALE: This work focuses on density functional 
theory (DFT), a particularly popular quantum 
method for both academic and industrial 
applications. More than 15,000 DFT papers are 
published each year, and DFT is now increas- 
ingly used in an automated fashion to build 
large databases or apply multiscale techniques 
with limited human supervision. Therefore, the 
reproducibility of DFT results underlies the 
scientific credibility of a substantial fraction of 
current work in the natural and engineering 
sciences. A plethora of DFT computer codes 
are available, many of them differing consid- 
erably in their details of implementation, and 
each yielding a certain “precision” relative to 
other codes. How is one to decide for more than 
a few simple cases which code predicts the cor- 
rect result, and which does not? We devised a 
procedure to assess the precision of DFT methods 
and used this to demonstrate reproducibility 
among many of the most widely used 
DFT codes. The essential part of this assessment 
is a pairwise comparison of a wide range of 
methods with respect to their predictions of the 
equations of state of the elemental crystals. This 
effort required the combined expertise of a large 
group of code developers and expert users. 
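
The equation-of-state comparison described above can be illustrated with a short sketch. Everything specific here is an assumption rather than the benchmark's stated protocol: a third-order Birch-Murnaghan form is assumed for E(V), the comparison window of ±6% around the equilibrium volume is hypothetical, and the function names and sample parameters (energies in eV, volumes in Å³) are invented for illustration.

```python
import numpy as np


def birch_murnaghan(volume, e0, v0, b0, b0_prime):
    """Third-order Birch-Murnaghan energy E(V)."""
    eta = (v0 / np.asarray(volume, dtype=float)) ** (2.0 / 3.0)
    return e0 + 9.0 * v0 * b0 / 16.0 * (
        (eta - 1.0) ** 3 * b0_prime + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )


def fit_eos(volumes, energies):
    """Return (v0, b0, e0) from a cubic fit of E in x = V**(-2/3).

    The Birch-Murnaghan form is exactly a cubic polynomial in x, so an
    ordinary least-squares polynomial fit is sufficient.
    """
    x = np.asarray(volumes, dtype=float) ** (-2.0 / 3.0)
    shift = x.mean()  # centring improves the conditioning of the fit
    poly = np.poly1d(np.polyfit(x - shift, np.asarray(energies, dtype=float), 3))
    d1, d2 = poly.deriv(1), poly.deriv(2)
    # Keep the real stationary point that is a minimum at positive x.
    minima = [
        r.real for r in d1.roots
        if abs(r.imag) < 1e-12 and r.real + shift > 0 and d2(r.real) > 0
    ]
    if not minima:
        raise ValueError("no energy minimum found in the fitted curve")
    x0 = minima[0] + shift
    v0 = x0 ** -1.5
    # At the minimum dE/dx = 0, so B0 = V0 * d2E/dV2 = (4/9) E''(x0) V0**(-7/3).
    b0 = 4.0 / 9.0 * d2(minima[0]) * v0 ** (-7.0 / 3.0)
    return v0, b0, poly(minima[0])


def rms_difference(params_a, params_b, n_points=201):
    """RMS energy difference between two EOS over +/-6% around the mean V0.

    Each parameter tuple is (e0, v0, b0, b0_prime); energies are compared
    relative to each curve's own minimum.
    """
    v_mid = 0.5 * (params_a[1] + params_b[1])
    grid = np.linspace(0.94 * v_mid, 1.06 * v_mid, n_points)
    ea = birch_murnaghan(grid, 0.0, *params_a[1:])
    eb = birch_murnaghan(grid, 0.0, *params_b[1:])
    return float(np.sqrt(np.mean((ea - eb) ** 2)))


# Hypothetical E(V) points for one crystal, as one code might produce them.
vols = np.linspace(18.8, 21.2, 7)
ens = birch_murnaghan(vols, -5.0, 20.0, 0.6, 4.5)
v0_fit, b0_fit, e0_fit = fit_eos(vols, ens)
print(f"V0 = {v0_fit:.3f}, B0 = {b0_fit:.4f}, E0 = {e0_fit:.4f}")
```

A pairwise comparison then reduces to calling `rms_difference` on the fitted parameters from two codes for the same crystal and averaging over crystals.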


RESULTS: We calculated equation-of-state data 
for four classes of DFT implementations, totaling 
40 methods. Most codes agree very well, 
with pairwise differences that are comparable 
to those between different high-precision 
experiments. Even in the case of 
pseudization approaches, which largely depend on 
the atomic potentials used, a similar precision can be 
obtained as when using the full potential. The 
remaining deviations are due to subtle effects, such as 
specific numerical implementations or the 
treatment of relativistic terms. 

Read the full article at http://dx.doi.org/10.1126/science.aad3000 


CONCLUSION: Our work demonstrates that 
the precision of DFT implementations can be 
determined, even in the absence of one absolute 
reference code. Although this was not the case 5 
to 10 years ago, most of the commonly used codes 
and methods are now found to predict essen- 
tially identical results. The established precision 
of DFT codes not only ensures the reproducibility 
of DFT predictions but also puts several past and 
future developments on a firmer footing. Any 
newly developed methodology can now be tested 
against the benchmark to verify whether it 
reaches the same level of precision. New DFT ap- 
plications can be shown to have used a suffi- 
ciently precise method. Moreover, high-precision 
DFT calculations are essential for developing im- 
provements to DFT methodology, such as new 
density functionals, which may further increase 
the predictive power of the simulations. 

Recent DFT methods yield reproducible results. Whereas older DFT implementations predict different values (red darts), codes have now evolved to 
mutual agreement (green darts). The scoreboard illustrates the good pairwise agreement of four classes of DFT implementations (horizontal direction) 
with all-electron results (vertical direction). Each number reflects the average difference between the equations of state for a given pair of methods, with 
the green-to-red color scheme showing the range from the best to the poorest agreement. 

Reproducibility in density functional 
theory calculations of solids 

The widespread popularity of density functional theory has given rise to an extensive range 
of dedicated codes for predicting molecular and crystalline properties. However, each 
code implements the formalism in a different way, raising questions about the reproducibility 
of such predictions. We report the results of a community-wide effort that compared 
15 solid-state codes, using 40 different potentials or basis set types, to assess the quality 
of the Perdew-Burke-Ernzerhof equations of state for 71 elemental crystals. We conclude 
that predictions from recent codes and pseudopotentials agree very well, with pairwise 
differences that are comparable to those between different high-precision experiments. 
Older methods, however, have less precise agreement. Our benchmark provides a 
framework for users and developers to document the precision of new applications and 
methodological improvements. 


Scientific results are expected to be reproducible. 
When the same study is repeated independently, 
it should reach the same conclusions. 
Nevertheless, some recent articles 
have shown that reproducibility is 
not self-evident. A widely resounding Science 
article (1), for example, demonstrated a lack of 
reproducibility among published psychology ex- 
periments. Although the hard sciences are believed 
to perform better in this respect, concerns about 
reproducibility have emerged in these fields as 
well (2-4). The issue is of particular interest when 
computer programs are involved. Undocumented 
approximations or undetected bugs can lead to 
wrong conclusions (5). In areas where academic 
codes compete with commercial software, the un- 
availability of source code can hinder assessment 
of the relevance of conclusions (6, 7). 

Density functional theory (DFT) calculations 
(8, 9) are a prominent example of an area that 
depends on the development and appropriate 
use of complex software. With rigorous foundations 
in the quantum theory of matter, DFT 
describes the structure and properties of molecules 
and solids at the atomic scale. Over the years, 
many academic groups have developed imple- 
mentations of DFT in computer codes, and several 
of these have been adopted by large user commu- 
nities. Commercial alternatives are entering this 
area as well. At present, more than 15,000 papers 
are published each year that make use of DFT 
codes (10), with applications varying from metallurgy 
to drug design. Moreover, DFT calculations 
are used nowadays to build large databases (11, 12) 
and in multiscale calculations in which they serve 
as one part of the tool chain (13, 14). The precision 
of DFT codes thus underlies the scientific credi- 
bility and reproducibility of a substantial fraction 
of current work in the natural and engineering sci- 
ences, and therefore it has implications that reach 
far beyond the traditional electronic-structure 
research community. 

The main idea of DFT is to solve the intractable 
many-particle Schrödinger equation by replacing 
the complete electron wave function with 
the much simpler ground-state electron density as 
the fundamental variable. Although this reformulation 
is in principle exact, it is not fully known 
how the interaction between individual electrons 
should be transformed. As a result, the specific form 
of the unknown part of the interaction energy, the 
exchange-correlation functional, has been the focus 
of many investigations, leading to a plethora of 
available functionals in both solid-state physics 
(15-19) and quantum chemistry (15, 20-23). 

Once a particular exchange-correlation function- 
al has been chosen, the mathematical problem is 
completely specified as a set of Kohn-Sham equa- 
tions, whose solution yields orbitals and energies 
from which the total electronic energy can be 
evaluated. A variety of such numerical solution 
schemes have been implemented in different com- 
puter codes. Comparisons of their performance 
are much less frequent or extensive than those 
of exchange-correlation functionals, however 
(21, 24-29). One might reasonably expect that 
because they solve the same equations, they all 
produce similar answers for a given crystal 
structure, but a glance at the literature shows 
that this assumption is by no means always true. 
Figure 1 demonstrates that even for a well-studied 
material such as silicon, deviations between pre- 
dictions from different codes (the “precision”) are 
of the same order of magnitude as the deviation 
from the 0 K experimental value (the “accuracy”)
(26, 30). Because all of the codes shown in Fig. 
1 treat silicon at the same level of theory, using 
the same exchange-correlation functional, they 
yield the same accuracy by definition. However, 
the particular predictions vary from one code to 
another because of approximations that are un- 
related to the exchange-correlation functional. 
These approximations decrease the computational 
load but limit the precision. 

What level of precision can we now achieve? 
Discussion of precision-related issues is uncom- 
mon in reports of solid-state DFT studies. The 
reproducibility of predictions is sometimes checked 
by cross-validation with other codes (21, 24-28),
but we are not aware of any systematic assessments 
of precision (also called “verification”), even though 
such studies would reinforce confidence in prac- 
tical DFT calculations. 

As a group of 69 code developers and expert 
users, we determined the error bar associated 
with energy-versus-volume [E(V)] predictions of 
elemental solids by running the same benchmark 
protocol with various DFT codes. Parameters of 
these equations of state (EOS), such as the lattice 
parameter or the bulk modulus, are commonly 
used for accuracy assessments (15-19). By consid- 
ering elemental solids, we have established a 
broad and comprehensive test for precision. Ele- 
mental solids have a wide range of chemical 
environments and constitute a reasonable first 
approximation to sampling the broad compo- 
sitional space of multicomponent systems. Our 
effort has resulted in 18,602 DFT calculations, 
which we aimed to execute with a rigorously de- 
termined precision. This exercise might seem 
simple, but each code tackles the Kohn-Sham 
equations and subsequent energy evaluation in 


its own way, requiring different approaches to 
deal with difficulties in different parts of the
computational procedure. 
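
As an illustration of how EOS parameters such as the equilibrium volume and bulk modulus can be extracted from E(V) data, the following minimal sketch fits a third-order Birch-Murnaghan form, which is a cubic polynomial in V^(-2/3). The choice of this EOS form, the function names, and the numerical values (V0 = 20 Å³, B0 = 0.6 eV/Å³, B1 = 4.5) are assumptions made for the example and are not taken from the benchmark data.

```python
import numpy as np

EV_PER_A3_TO_GPA = 160.21766208  # unit conversion for the bulk modulus


def birch_murnaghan(v, e0, v0, b0, b1):
    """Third-order Birch-Murnaghan energy E(V); b0 in eV/A^3."""
    eta = (v0 / v) ** (2.0 / 3.0)
    return e0 + 9.0 * v0 * b0 / 16.0 * (
        (eta - 1.0) ** 3 * b1 + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )


def fit_eos(volumes, energies):
    """Return (v0, b0, b1) from a cubic fit of E against x = V**(-2/3)."""
    x = np.asarray(volumes, dtype=float) ** (-2.0 / 3.0)
    poly = np.poly1d(np.polyfit(x, energies, 3))
    d1, d2, d3 = poly.deriv(1), poly.deriv(2), poly.deriv(3)
    # The equilibrium is the stationary point with x > 0 and positive curvature.
    minima = [r.real for r in d1.roots
              if abs(r.imag) < 1e-12 and r.real > 0 and d2(r.real) > 0]
    if not minima:
        raise ValueError("no energy minimum found in the fitted range")
    x0 = minima[0]
    v0 = x0 ** -1.5
    # Chain rule from x to V, evaluated where dE/dx = 0.
    d2e_dv2 = 4.0 / 9.0 * x0 ** 5 * d2(x0)
    d3e_dv3 = -20.0 / 9.0 * x0 ** 6.5 * d2(x0) - 8.0 / 27.0 * x0 ** 7.5 * d3(x0)
    b0 = v0 * d2e_dv2                      # B0 = V d2E/dV2
    b1 = -1.0 - v0 * d3e_dv3 / d2e_dv2     # B1 = dB/dP
    return v0, b0, b1


# Hypothetical E(V) points on a +/-6% interval around V0 (illustrative only).
true_v0, true_b0, true_b1 = 20.0, 0.6, 4.5
volumes = np.linspace(0.94 * true_v0, 1.06 * true_v0, 7)
energies = birch_murnaghan(volumes, -5.0, true_v0, true_b0, true_b1)

v0, b0, b1 = fit_eos(volumes, energies)
print(f"V0 = {v0:.4f} A^3, B0 = {b0 * EV_PER_A3_TO_GPA:.2f} GPa, B1 = {b1:.3f}")
```

Because the synthetic data follow the fitted form exactly, the fit recovers the input parameters; with real DFT data the residual of the fit would also have to be checked.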


Kohn-Sham solution techniques 


The Kohn-Sham equations describe a many- 
electron system in terms of a density built from 
single-particle wave functions. By expressing these 
wave functions as a linear combination of pre- 
defined basis functions, the Kohn-Sham equations 
reduce to a matrix equation, which can in prin- 
ciple be solved exactly. Their solutions should 
yield identical results, irrespective of the form 
of the basis functions, provided that the basis 
set is complete. However, achieving technical 
convergence of the complete Kohn-Sham prob- 
lem is not feasible in practice. Consider silicon, 
whose electronic structure is schematically illus- 
trated in Fig. 2. The Aufbau principle requires 
first populating the lowest energy level, which 
is the 1s band. This is much lower in energy than 
the valence and conduction bands, and the locali- 
zation of the orbitals close to the nuclei demands 
high spatial resolution. These core electrons do 
not contribute directly to chemical bonding, so 
they can be separated out and represented using 
a different basis that is better suited to describe 
localized atomic-like states. Core orbitals may be 
either computed in an isolated atom environ- 
ment, with their effect on valence transferred 
unaltered to the crystal, or relaxed self-consistently 
in the full crystal field. They can moreover be 
treated using a relativistic Hamiltonian, which is 
essential for core electrons in heavy atoms. Dif- 
ferent relativistic schemes may lead to differences 
in the predicted E(V) curves. 
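
The reduction to a matrix equation by means of a basis-set expansion can be sketched with a toy one-dimensional problem. The example below is not a Kohn-Sham calculation: it assumes a single particle in a harmonic potential (units with ħ = m = ω = 1, exact ground-state energy 0.5), a sine basis in a box of length 10, and a simple trapezoid quadrature. It only illustrates that the lowest eigenvalue converges as the basis set grows.

```python
import numpy as np


def lowest_energy(n_basis, box=10.0, n_grid=2001):
    """Lowest eigenvalue of H = -1/2 d2/dx2 + 1/2 (x - box/2)^2 in a sine basis."""
    x = np.linspace(0.0, box, n_grid)
    h = x[1] - x[0]
    weights = np.full(n_grid, h)
    weights[0] = weights[-1] = 0.5 * h          # trapezoid quadrature weights
    n = np.arange(1, n_basis + 1)
    # Orthonormal particle-in-a-box functions, shape (n_basis, n_grid).
    phi = np.sqrt(2.0 / box) * np.sin(np.outer(n, x) * np.pi / box)
    kinetic = np.diag(0.5 * (n * np.pi / box) ** 2)   # diagonal in this basis
    v = 0.5 * (x - 0.5 * box) ** 2
    potential = (phi * (weights * v)) @ phi.T         # <phi_m | V | phi_n>
    return np.linalg.eigvalsh(kinetic + potential)[0]


basis_sizes = (5, 10, 20, 40)
energies = [lowest_energy(n) for n in basis_sizes]
for n, e in zip(basis_sizes, energies):
    print(f"{n:3d} basis functions: E0 = {e:.8f}")
```

The smaller basis sets are subsets of the larger ones, so the lowest eigenvalue can only decrease as functions are added; this is the sense in which results become independent of the basis once it is effectively complete.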

To stitch together a complete solution, the wave 
functions of the semi-core and valence electrons 
(2s 2p and 3s 3p, respectively, in the case of sili- 
con) must be constructed to include the effect of 
orthogonality to the core electrons. This central 
problem can be solved in a number of different 


ways, depending on the choice of numerical meth- 
od. For methods that are based on plane-wave 
expansions or uniform real-space grids, the os- 
cillatory behavior near the nucleus cannot be 
accurately represented because of the limited 
spatial resolution. The need for unmanageably 
large basis sets can be mitigated by adding a care- 
fully designed repulsive part to the Kohn-Sham 
potential, a so-called pseudopotential. This pseudo- 
potential affects only a small region around the 
nuclei (gray zones in Fig. 2) and may conserve 
the core-region charge [norm-conserving pseudopotentials
(31, 32)], giving rise to an analytically
straightforward formalism, or it may break
norm conservation by including a compensating 
augmentation charge [ultrasoft pseudopotentials 
(33)], allowing for smoother wave functions and 
hence smaller basis sets. Alternatively, the projector- 
augmented wave (PAW) approach defines an 
explicit transformation between the all-electron 
and pseudopotential wave functions by means of 
additional partial-wave basis functions (34, 35). 
This allows PAW codes to obtain good precision 
for small numbers of plane waves or large grid 
spacings, but choosing suitable partial-wave projec- 
tors is not trivial. Here we refer to both pseudo- 
potential and PAW methods as pseudization 
approaches. In contrast to these approaches, all- 
electron methods explicitly construct basis func- 
tions that are restricted to a specific energy range 
[linearized augmented plane wave (LAPW) (36-39) 
and linear muffin-tin orbital (LMTO) (40) methods] 
or treat core and valence states on equal footing 
(e.g., by using numerical atomic-like orbitals) 
(41, 42). Avoiding pseudization enables better 
precision but inevitably increases the computation 
time. In these codes, the ambiguity in solving the 
Kohn-Sham problem shifts from the choice of the 
pseudization scheme to the choice of the basis 
functions. This choice leads to a variety of methods 
as well, which, despite solving the same Kohn-Sham 


equations, differ in many other details. Because 
each all-electron or pseudization method has its 
own fundamental advantages, it is highly desir- 
able to achieve high precision for all of them. 


The Δ matrix


The case study for silicon (Fig. 1) demonstrates that
different approaches to the potential or basis
functions may lead to noticeably different predictions,
even for straightforward properties such as
the lattice parameter. There is no absolute reference
against which to compare these methods;
each approach has its own intricacies and approximations.
To determine whether the same results
can be obtained irrespective of the code or
(pseudo)potential, we instead present a large-scale
pairwise code comparison using the Δ gauge. This
criterion was formulated by Lejaeghere et al. (26)
to quantify differences between DFT-predicted
E(V) profiles in an unequivocal way. That study
proposed a benchmark set of 71 elemental crystals
and defined, for every element i, the quantity Δ_i as
the root-mean-square difference between the EOS
of methods a and b over a ±6% interval around
the equilibrium volume V_{0,i}. The calculated EOS
are lined up with respect to their minimum energy
and compared in an interval that is symmetrical
around the average equilibrium volume
(Fig. 3).


$$\Delta_i(a,b) = \sqrt{\frac{\int_{0.94 V_{0,i}}^{1.06 V_{0,i}} \left[ E_{b,i}(V) - E_{a,i}(V) \right]^2 \, dV}{0.12\, V_{0,i}}} \qquad (1)$$


A comparison of Δ_i values allows the expression
of EOS differences as a single number, and
a small Δ_i automatically implies small deviations
between equilibrium volumes, bulk moduli,
or any other EOS-derived observables. The overall
difference Δ between methods a and b is
obtained by averaging Δ_i over all 71 crystals in
the benchmark set. Alternative definitions of Δ
essentially render the same information (27, 28).
In this work, we applied the original Δ protocol
to 40 DFT implementations of the Perdew-Burke-Ernzerhof
(PBE) functional (43). Appropriate
numerical settings were determined separately
for each method, ensuring converged results. In
all calculations, valence and semi-core electrons
were treated on a scalar-relativistic level, because
not all codes support spin-orbit coupling.
This is not a limitation, because the aim is to
compare codes with each other rather than to
experiment. We do not elaborate here on speed
and memory requirements, for which we refer to
the documentation of the respective codes.
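
The Δ gauge of Eq. 1 and its average over a benchmark set can be sketched numerically as follows. This is a minimal illustration, not the benchmark code itself: the Birch-Murnaghan form, the function names, the two-element "benchmark set", and all parameter values are hypothetical, and the integral is evaluated with a simple trapezoid rule.

```python
import numpy as np


def birch_murnaghan(v0, b0, b1):
    """Return E(V) (eV/atom) for a Birch-Murnaghan EOS with its minimum at E = 0."""
    def energy(v):
        eta = (v0 / v) ** (2.0 / 3.0)
        return 9.0 * v0 * b0 / 16.0 * (
            (eta - 1.0) ** 3 * b1 + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
        )
    return energy


def delta_i(energy_a, energy_b, v0_a, v0_b, n_points=2001):
    """Root-mean-square EOS difference over +/-6% around the average V0 (Eq. 1)."""
    v0 = 0.5 * (v0_a + v0_b)
    v = np.linspace(0.94 * v0, 1.06 * v0, n_points)
    diff2 = (energy_b(v) - energy_a(v)) ** 2
    h = v[1] - v[0]
    integral = h * (diff2.sum() - 0.5 * (diff2[0] + diff2[-1]))  # trapezoid rule
    return np.sqrt(integral / (0.12 * v0))


# Hypothetical (V0 [A^3], B0 [eV/A^3], B1) for two methods and two elements.
benchmark = {
    "X": ((20.0, 0.60, 4.5), (20.1, 0.59, 4.4)),
    "Y": ((16.0, 1.00, 5.0), (16.0, 1.00, 5.0)),
}

per_element = {}
for element, (params_a, params_b) in benchmark.items():
    per_element[element] = delta_i(
        birch_murnaghan(*params_a), birch_murnaghan(*params_b),
        params_a[0], params_b[0],
    )

delta = np.mean(list(per_element.values()))  # overall Delta: average over elements
print({k: f"{1000 * val:.3f} meV/atom" for k, val in per_element.items()})
print(f"Delta = {1000 * delta:.3f} meV/atom")
```

Both EOS are expressed relative to their own energy minimum before they are compared, which mirrors the alignment described above; identical EOS give Δ_i = 0 by construction.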

Figure 4 presents an overview of the most important
Δ values, categorized by method: all-electron,
PAW, ultrasoft pseudopotentials, and
norm-conserving pseudopotentials. Approaches 
with a similar intrinsic precision are clustered 
together in this way. Both the full results and 
the most important numerical settings are in- 
cluded in tables S3 to S42. A complete specifica- 
tion would have to include code defaults and 
hard-coded values, so a reasonable compromise 
was chosen. A full specification could be re- 
alized by recent endeavors in full-output data- 
bases (44, 45) or workflow scripting (46, 47), 
but this capacity is not yet available for several 
of the codes used in this study. We have, how- 
ever, tried to provide generation scripts for as 
many methods as possible (48), and we empha- 
size the need for such tools as an important fu- 
ture direction. 


Comparing all-electron methods 


Although the definition of Δ does not favor a
particular reference, it is instructive to first examine
the Δ values with respect to all-electron
methods (Fig. 4). They generally come at a higher
computational cost, but all-electron approaches
are often considered to be a standard
for DFT calculations, because they implement
the potential without pseudization. By comparing
pseudopotential or PAW methods with
all-electron codes, we can therefore get an idea
of the error bar associated with each pseudization
scheme. The Δ values between different
all-electron methods reflect the remaining discrepancies,
such as a different treatment of the
scalar-relativistic terms or small differences in
numerical methods.

To gain some insight into typical values of Δ,
we should first establish which values for Δ can
be qualified as “small,” so that we know which
results can be considered equivalent. A first indication
comes from converting differences
between high-precision measurements of EOS
parameters into a Δ format. Comparing the high-quality
experimental data of Holzapfel et al. for
Cu, Ag, and Au (49) with those of Kittel (50) and
Knittle (51), for example, shows a small difference
Δ_exp of 1.0 meV per atom. Because the average
all-electron Δ for these materials is only 0.8 meV per
atom, this implies that the precision of many DFT
codes outperforms experimental precision.
Secondly, we also considered the differences
between codes in terms of commonly reported
EOS parameters. The 1.0 meV-per-atom maximum
Δ among all-electron codes (Fig. 4, top) corresponds
to an average volume deviation of 0.14 Å³ per atom
(0.38%) or a median deviation of 0.05 Å³ per atom
(0.24%) over the entire 71-element test set. For
the bulk modulus, the average deviation is 1.6 GPa
(4.0%), and the median deviation 0.8 GPa (1.6%).


Compared with the scatter on experimental values, 
which can amount to up to 35% for the bulk 
moduli of the rare earth metals [for instance, 
see (52)], these values are very small. The differ- 
ence between EOS obtained by independent all- 
electron codes is hence smaller than the spread 
between independent experimental EOS. We con- 
clude that, unless some elements deviate sub- 
stantially from the overall trend, codes with a 
Fig. 1. Historical evolution of the predicted equilibrium lattice parameter for silicon. All data points
represent calculations within the DFT-PBE framework. Values from literature (data points before
2016) (15, 16, 18, 56-62, 63-65) are compared with (i) predictions from the different codes used in
this study (2016 data points, magnified in the inset; open circles indicate data produced by older
methods or calculations with lower numerical settings) and (ii) the experimental value, extrapolated
to 0 K and corrected for zero-point effects (red line) (26). The concepts of precision and accuracy
are illustrated graphically.
Fig. 2. [...] solid (green line), because the wave functions overlap from one atom to the next.
The lowest-energy 1s state (red) is at an energy two orders of magnitude lower than the valence
states and is strongly localized near the nucleus, with no overlap between the atoms. The gray
regions around the atoms indicate approximately where the wave function, density, and potential
are smoothed in pseudized methods.
Fig. 3. Graphical representation of the Δ gauge. The black curve depicts the quadratic energy
difference between two EOS [(E_1 − E_2)², where the subscripts correspond to the two codes shown],
and Δ_i corresponds to the root-mean-square average. This is demonstrated by the shaded area, which
is equally large above and below the Δ_i² line. Axes: energy of element i (meV/atom) versus
volume (Å³/atom).
Table 1. Agreement between osmium crystal predictions at nearly identical settings. The top group
includes Δ_i values for the osmium crystal (in millielectron volts per atom) produced by four APW+lo
calculations that tried to mimic the same settings as well as possible. These settings are therefore
different from the ones used for Fig. 4 and reported in tables S3, S4, S8, and S15. The bottom group
includes the corresponding equilibrium volumes V_0, bulk moduli B_0, and bulk modulus derivatives B_1.


              Elk     FLEUR   WIEN2k  exciting
Δ(Elk)        -       0.03    0.02    0.20
Δ(FLEUR)              -       0.04    0.22
Δ(WIEN2k)                     -       0.18


V_0
B_0 (GPa)
B_1 (unitless)

Table 2. Precision evolution of PAW and pseudopotential sets over time. The Δ values are expressed
as an average over the all-electron methods (in millielectron volts per atom) and are listed
chronologically per code. The corresponding code settings and the DFT-predicted EOS parameters are
listed in tables S17, S19 to S26, S30, S31, and S33. The most recent potentials are the ones used to
generate the data shown in Fig. 4.


                      Year    ⟨Δ⟩ versus AE
JTH01/ABINIT          2013    1.1
JTH02/ABINIT          2014    0.6
[...]
OTFG7/CASTEP          2013    2.6
OTFG9/CASTEP          2015    0.7
[...]
GPAW09/GPAW           2012    1.6
[...]
PSlib100/QE           2013    1.0
[...]
VASP2012/VASP         2012    0.8
VASPGW2015/VASP       2015    0.6


mutual Δ of 1 or even 2 meV per atom can be
deemed to yield indistinguishable EOS for all
practical purposes.

The above-mentioned differences correspond
to the best attainable precision for each all-electron
code, using highly converged or “ultimate” computational
settings. However, particular choices for
these settings may still slightly change the Δ
values. It is not always necessary to set such
stringent requirements, because efficient codes 
are able to perform well with less-than-perfect 
settings. Nevertheless, the difference between 
default- and ultimate-precision EOS may some- 
times reach a few millielectron volts per atom 
(table S2). To eliminate the effect of numerical 
convergence altogether, we used the osmium crys- 
tal to test whether it is possible to obtain exactly 
the same result with different codes. Rather than 
aiming for the best representation of the ideal 
PBE results, as in the rest of this work, the goal in 
this case was to choose input settings as consist- 
ently as possible (using the same basis functions, 
grids, and other parameters). Comparing four 
APW+lo (augmented plane waves plus local or- 
bitals) calculations in this way yielded the results 
in Table 1. Whereas numerical noise in various 
subroutines gives rise to fluctuations of only 
0.02 to 0.04 meV per atom, the larger deviation of 
~0.2 meV per atom in comparisons involving the 
code known as “exciting” can partly be attributed 
to a different scalar-relativistic treatment of the va- 
lence electrons in this code. There is no single uni- 
versal method to account for the relativistic change 
of the electron mass in the kinetic energy. The 
“exciting” code uses the infinite-order regular ap- 
proximation (53), whereas the other three APW+lo 
codes use the Koelling-Harmon scheme (54). A 
third possibility is to use the atomic zero-order 
regular approximation, as was done in the FHI- 
aims code package (tables S5 to S7) (42, 55). 
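
The pattern in Table 1 can be made explicit by arranging the pairwise values in a symmetric matrix and averaging each code against the others. The sketch below uses the six osmium Δ_i values as read from Table 1; the variable names and the averaging itself are illustrative choices, not part of the original analysis.

```python
import numpy as np

codes = ["Elk", "FLEUR", "WIEN2k", "exciting"]

# Pairwise osmium Delta_i values (meV per atom) as read from Table 1.
pairs = {
    ("Elk", "FLEUR"): 0.03,
    ("Elk", "WIEN2k"): 0.02,
    ("Elk", "exciting"): 0.20,
    ("FLEUR", "WIEN2k"): 0.04,
    ("FLEUR", "exciting"): 0.22,
    ("WIEN2k", "exciting"): 0.18,
}

n = len(codes)
delta = np.zeros((n, n))
for (a, b), value in pairs.items():
    i, j = codes.index(a), codes.index(b)
    delta[i, j] = delta[j, i] = value   # Delta is symmetric in the two methods

# Average difference of each code with respect to the three other codes.
mean_vs_others = delta.sum(axis=1) / (n - 1)
for code, mean in zip(codes, mean_vs_others):
    print(f"{code:9s} {mean:.3f} meV per atom")
```

The three codes that share the Koelling-Harmon scheme differ from one another by at most 0.04 meV per atom, whereas the average for "exciting" is about 0.2 meV per atom, in line with the discussion above.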


Comparing (pseudo)potential libraries 


In comparison with all-electron codes, pseudization 
approaches are generally faster, because fewer 
states are considered, and explicit construction 
and diagonalization of the Hamiltonian matrix 
is avoided. Among these, PAW and ultrasoft pseu- 
dopotentials require fewer basis functions than 
the norm-conserving variety, but advanced fea- 
tures such as linear response theory or hybrid 
functionals sometimes may not be available 
because of the increased complexity of the im- 
plementation. However, pseudization approaches 
all perform very well in terms of precision when 
compared with all-electron results (Fig. 4). For 
EOS, the precision of current potentials is able to 
compete with that of all-electron methods, yielding
Δ values of about 1 meV per atom, with a low
approaching 0.3 meV per atom. This has not always
been the case. As suggested by the example
of silicon (Fig. 1), the available potentials have
improved considerably over time. In Table 2, it
can be seen that for several codes, the Δ value is
smaller for newer potential sets. Moreover, older
potentials such as the Troullier-Martins FHI98pp
norm-conserving set in ABINIT or the Vanderbilt-type
ultrasoft sets in Dacapo and CASTEP all have
a substantially larger Δ (Fig. 4). This evolution is
evidence of internal quality-control mechanisms
used by developers of potentials in the past, as
well as of additional, more recent efforts based
on the Δ gauge [e.g., the Jollet-Torrent-Holzwarth
(JTH) and Standard Solid-State Pseudopotentials
Fig. 4. Δ values for comparisons between the most important DFT methods considered (in
millielectron volts per atom). Shown are comparisons of all-electron (AE), PAW, ultrasoft (USPP), and
norm-conserving pseudopotential (NCPP) results with all-electron results (methods are listed in
alphabetical order in each category). The labels for each method stand for code, code/specification (AE), or
potential set/code (PAW, USPP, and NCPP) and are explained in full in tables S3 to S42. The color coding
illustrates the range from small (green) to large (red) Δ values. The mixed potential set SSSP was added to
the ultrasoft category, in agreement with its prevalent potential type. Both the code settings and the
DFT-predicted EOS parameters behind these numbers are included in tables S3 to S42, and fig. S1 provides a
full Δ matrix for all methods mentioned in this article.


(SSSP) libraries]. The considerable difference 
in the older potentials, even for the predefined 
structures in this relatively simple test set, pro- 
vides a compelling argument to use only the most 
recent potential files of a given code. 

In addition to the comparison with all-electron 
codes, it is also interesting to assess how the 
same PAW or pseudopotential recipes are im- 
plemented in different ways. When both the GPAW 
and ABINIT codes use the GPAW 0.9 PAW set, 
for example, they agree to within a Δ of 0.6 meV per
atom. A similar correspondence is found for the
Schlipf-Gygi 2015-01-24 optimized norm-conserving
Vanderbilt pseudopotentials (ONCVPSP) (0.3 meV
per atom between Quantum ESPRESSO and
CASTEP), the Garrity-Bennett-Rabe-Vanderbilt
(GBRV) 1.4 ultrasoft pseudopotentials (0.3 meV per
atom between Quantum ESPRESSO and CASTEP),
and the GBRV 1.2 set (0.7 meV per atom between
PAW potentials in ABINIT and ultrasoft potentials
in Quantum ESPRESSO). In this case, too,
the small Δ values indicate a good agreement
between codes. This agreement moreover encom- 
passes varying degrees of numerical convergence, 
differences in the numerical implementation of 
the particular potentials, and computational dif- 
ferences beyond the pseudization scheme, most 
of which are expected to be of the same order of 
magnitude or smaller than the differences among 
all-electron codes (1 meV per atom at most). 


Conclusions and outlook 


Solid-state DFT codes have evolved considerably. 
The change from small and personalized codes to 
widespread general-purpose packages has pushed 
developers to aim for the best possible precision. 
Whereas past DFT-PBE literature on the lattice 
parameter of silicon indicated a spread of 0.05 Å,
the most recent versions of the implementations
discussed here agree on this value within 0.01 Å
(Fig. 1 and tables S3 to S42). By comparing codes
on a more detailed level using the Δ gauge, we
have found the most recent methods to yield 
nearly indistinguishable EOS, with the associ- 
ated error bar comparable to that between dif- 
ferent high-precision experiments. This underpins 
the validity of recent DFT EOS results and confirms 
that correctly converged calculations yield reliable 
predictions. The implications are moreover rele- 
vant throughout the multidisciplinary set of fields 
that build upon DFT results, ranging from the 
physical to the biological sciences. 

In spite of the absence of one absolute refer- 
ence code, we were able to improve and demon- 
strate the reproducibility of DFT results by means 
of a pairwise comparison of a wide range of codes 
and methods. It is now possible to verify whether 
any newly developed methodology can reach the 
same precision described here, and new DFT 
applications can be shown to have used a meth- 
od and/or potentials that were screened in this 
way. The data generated in this study serve as a 
crucial enabler for such a reproducibility-driven 
paradigm shift, and future updates of available 
Δ values will be presented at
http://molmod.ugent.be/deltacodesdft. The reproducibility of
reported results also provides a sound basis for 
further improvement to the accuracy of DFT, 
particularly in the investigation of new DFT func- 
tionals, or for the development of new computa- 
tional approaches. This work might therefore 
substantially accelerate methodological advances 
in solid-state DFT. 

Future work can examine the reproducibility 
of different codes even further. Such work might 
involve larger benchmark sets (describing differ- 
ent atomic environments per element), other func- 
tionals, an exhaustive comparison of different 
relativistic treatments, and/or a more detailed ac- 
count of computational differences (using data- 
bases or scripts, for example). The precision of 
band gaps, magnetic anisotropies, and other non- 
EOS properties would also be of interest. How- 
ever, the current investigation of EOS parameters 
provides the most important pass-fail test of the 
quality of different implementations of Kohn- 
Sham theory. A method that is not able to reach 
an acceptable precision with respect to the EOS
of the elemental crystals will probably not fulfill 
even more stringent demands. 


Methods summary 


This study relied on the collective efforts of a 
large group of developers and expert users to 
make pairwise comparisons of widely used DFT 
codes. We compared 40 DFT methods in terms 
of Δ, which expresses the root-mean-square difference
between the EOS of two codes, averaged
over a benchmark set of 71 elemental crystals 
(Eq. 1). Our approach, including details about 
the codes used, is described further in the sup- 
plementary materials. The reported settings 
yield highly converged results but may not be 
necessary for typical DFT applications. In par- 
ticular, the use of sometimes very small electronic 
smearing widths requires much higher num- 
bers of k-points than routine DFT calculations 
warrant. 
STRUCTURAL BIOLOGY


Molecular architecture of the human 
U4/U6.U5 tri-snRNP 


Dmitry E. Agafonov, Berthold Kastner, Olexandr Dybkov, Romina V. Hofele,
Wen-Ti Liu, Henning Urlaub, Reinhard Lührmann, Holger Stark


The U4/U6.U5 triple small nuclear ribonucleoprotein (tri-snRNP) is a major 
spliceosome building block. We obtained a three-dimensional structure of the 
1.8-megadalton human tri-snRNP at a resolution of 7 angstroms using single-particle 
cryo-electron microscopy (cryo-EM). We fit all known high-resolution structures 

of tri-snRNP components into the EM density map and validated them by 

protein cross-linking. Our model reveals how the spatial organization of Brr2 

RNA helicase prevents premature U4/U6 RNA unwinding in isolated human 
tri-snRNPs and how the ubiquitin C-terminal hydrolase-like protein Sad1 likely tethers 
the helicase Brr2 to its preactivation position. Comparison of our model with 
cryo-EM three-dimensional structures of the Saccharomyces cerevisiae tri-snRNP 
and Schizosaccharomyces pombe spliceosome indicates that Brr2 undergoes 

a marked conformational change during spliceosome activation, and that the 
scaffolding protein Prp8 is also rearranged to accommodate the spliceosome’s 


catalytic RNA network. 


The spliceosome is formed stepwise by recruitment
of the U1 and U2 snRNPs (small nuclear
ribonucleoproteins) and the U4/U6.U5
tri-snRNP, plus numerous other proteins,
to the pre-mRNA (1). Initially, U1 and U2 interact
with the pre-mRNA’s 5’ splice site (SS) and
branch site (BS), respectively, generating the A
complex. The tri-snRNP then joins, leading to formation
of the precatalytic spliceosomal B complex.
Subsequent catalytic activation of the spliceosome
involves major structural rearrangements of multiple
tri-snRNP components (1).

The 1.8-MDa tri-snRNP is the largest preformed 
building block of the human spliceosome. It con- 
tains three snRNA molecules (U4, U6, and U5), 
two heteroheptameric rings of Sm proteins bound 
to the U4 and U5 snRNAs’ 3’-terminal Sm sites, 
the LSm ring bound to the 3’ end of U6 snRNA, 
plus 16 additional proteins (1) (fig. S1). In the
tri-snRNP and B complex, U4 and U6 snRNA are
extensively base-paired. During activation, the U4/ 
U6 duplex is disrupted and a highly structured 
RNA interaction network forms among the U2, 
U6, and U5 snRNAs and the pre-mRNA, generating
the spliceosome’s catalytic RNA core (2, 3).
Three large U5 proteins, namely Prp8, the RNA helicase
Brr2, and the guanosine triphosphatase (GTPase)
Snu114, play key roles during catalytic activation.
Prp8 is a major scaffolding protein that interacts
with Brr2 and Snu114 (4) and all reactive sites of
the intron (5'SS, 3’SS, and BS) and is thus located 
at the heart of the spliceosome’s catalytic core (5, 6). 
Brr2 unwinds the U4/U6 snRNA helices and is the 
major driving force for catalytic activation (7, 8). 
However, as Brr2 and its RNA substrate are pres- 
ent in the tri-snRNP and precatalytic B complex, a 
mechanism must exist to prevent premature dis- 
sociation of the U4/U6 helices by Brr2. 

Here, we report a 3D cryo-electron microscopy 
(cryo-EM) structure of the human tri-snRNP at a 
resolution of 7 Å and resolve its spatial organization
with the aid of protein cross-linking. Comparison
with the recently reported cryo-EM structure
of the yeast tri-snRNP (9) reveals unexpected, 
large differences in the position of the helicase 
Brr2, including its position relative to its RNA 
substrate, the U4/U6 duplex. Our model also 
reveals the nature of tri-snRNP rearrangements 
that must occur during spliceosome maturation, 
including a major conformational change within 
the Prp8 protein, which adopts an open confor- 
mation in the human tri-snRNP and a closed one 
in the Schizosaccharomyces pombe spliceosome at 
late stages of splicing (10).


Structure determination and 
model building 


Human tri-snRNPs were affinity-purified from 
HeLa nuclear extract and prepared for cryo-EM 
by a modification of the GraFix protocol involving 
chemical cross-linking of the particles (fig. S1B) (11).


The 3D structure was determined from ~141,000 
particle images after several steps of computa- 
tional sorting, starting with an initial data set of 
~1,150,000 selected particle images (fig. S2). The 
calculated 3D structure of the tri-snRNP was determined
at a final overall resolution of 7 Å with
better-resolved parts in the center and somewhat 
lower-resolution areas in the U4/U6 part of the 
structure (fig. S3). Overall, the structure is entirely 
consistent with an earlier, lower-resolution 3D 
structure (12) showing the tri-snRNP as a roughly 
tetrahedral particle with dimensions of approximately
300 Å × 200 Å × 175 Å (Fig. 1). At this
resolution, structured protein domains and double- 
stranded RNA (dsRNA) elements can be identified 
clearly, allowing us to fit known x-ray structures 
or homology models of structured regions of tri- 
snRNP components into the EM density map 
(see table S1 for details regarding how proteins 
were fit into the EM density map). Additionally, 
we performed chemical protein cross-linking of 
purified tri-snRNPs together with mass spectrom- 
etry (CX-MS) (table S2). These data allowed us 
to validate the locations of large tri-snRNP pro- 
teins and facilitated docking of smaller proteins. 
Although we could place all snRNAs and struc- 
tured protein domains in the EM density map in 
a manner consistent with our protein cross-linking 
data, ~30% of the calculated stoichiometric mass 
of human tri-snRNP proteins are very likely in- 
trinsically unstructured regions that could not be 
localized (table S1). 


Structural organization of the U5 Sm 
core and the U4/U6 snRNP 


The helical regions of U4/U6 and U5 snRNA 
allowed their unambiguous placement in the EM 
density map (Fig. 1). The U5 Sm core is located 
at the lower tip of the tri-snRNP, with the 5’-terminal
m3G cap of U5 snRNA positioned close
to it, whereas U5 loop 1 is located more centrally
and stems 1b and 1c are coaxially stacked (Fig.
1B). The U4/U6 snRNAs are located in the upper, 
broader region of the human tri-snRNP. Their 
dsRNA regions are connected by a three-way 
junction and are located in a deeper, solvent- 
accessible cleft. The difference in length of U4/ 
U6 stems I and II and the clearly visible three-way 
junction define the orientation of U4/U6 snRNA 
in the model and indicate coaxial stacking of 
stems I and II (Fig. 1). The U4/U6 snRNAs also 
define the positions of the U4 Sm and U6 LSm 
protein rings, which are found at two corners in 
the upper part of the tri-snRNP (Fig. 1B). 

The geometry of the U4/U6 snRNA three-way 
junction allowed us to fit the crystal structures of 
(i) the U4 snRNA 5’ stem-loop in complex with 
Snu13 and a large part of the U4/U6 Prp31 protein,
(ii) a large part of Prp3 (Prp3-CTF) in complex
with U4/U6 stem II and the U6 single-stranded
3' overhang, (iii) the WD40 domain of the Prp3-associated
Prp4 protein, and (iv) the cyclophilin
H (CypH) protein into nearby density elements 
(Fig. 2A). The position of the various U4/U6 proteins 
was confirmed by protein-protein cross-linking 
(fig. S4). There is an overall similarity in the organi- 
zation of U4/U6 proteins in human and yeast 
tri-snRNPs, with differences in the architectural
details of some proteins (fig. S5) (see below).

The Prp6 protein contains 19 tetratricopeptide repeats
(TPRs) in its C-terminal region and is required
for stable tri-snRNP formation (13, 14). Consistent
with this, Prp6 forms a bridge across a deep cleft
at the top of the tri-snRNP that connects the U4/
U6 and U5 snRNPs (Fig. 2B). This is supported by
numerous cross-links, whereby Prp6’s N-terminal
and C-terminal TPRs exclusively form cross-links
to U5 and U4/U6 proteins, respectively. Consist-
ent with intramolecular cross-links between TPR 19




Fig. 1. Three-dimensional cryo-EM structure of the human U4/U6.U5 tri-snRNP and location of U5 and U4/U6 snRNAs and their Sm/LSm cores. 
(A) Different views of the tri-snRNP EM density map with helical high-density elements (blue) representing U5 (in lower region) and U4/U6 snRNA 
(upper region). (B) Position of the U5 Sm, U4 Sm, and U6 LSm cores. A schematic of U5 and U4/U6 snRNA with their Sm/LSm rings is shown. The 
double-stranded regions of U4/U6 and U5, and their heptameric Sm/LSm rings, are modeled into the cryo-EM map. Insets: RNA elements shown 
separately. 


Fig. 2. Structural organization of U4/U6 proteins and Prp6 and their locations in the human tri-snRNP. (A) Positions of the U4/U6 proteins and snRNA. 
Right: Expanded view of boxed region showing the U4/U6 snRNA three-way junction, the crystal structures of Prp31, Snu13, CypH, and the C-terminal fragment 
(CTF) of Prp3, and a modeled structure of Prp4’s WD40 domain, fit into the EM density map. (B) Prp6 forms a bridge connecting the U4/U6 and U5 snRNPs. 
Right: Expanded view of boxed region showing Prp6 TPR repeats and U4/U6 proteins. 
and TPRs 9 to 13 (fig. S4), the C-terminal TPRs fit
as a circularly arranged ensemble into a large 
ringlike density that is connected to U4/U6 pro- 
teins (Fig. 2B). 


The architecture of Snu114 and Prp8


Aside from its 115-residue N-terminal domain, the 
116-kDa Snu114 protein is highly homologous to 
ribosomal elongation factor EF-2/EF-G (15), and 
we could fit domains D1 to D5 of Snu114 into the 
lower part of the tri-snRNP, with D1 and D2 located 
closer to the U5 Sm core and D3 to D5 located more
centrally (Fig. 3A). Thus, in the isolated human
tri-snRNP, Snu114 adopts a compact form similar
to the compact structure of EF-2 (fig. S6) (16).

The crystal structure of a large fragment of 
yeast Prp8 (~110 kDa) containing a reverse tran- 
scriptase (RT)-like domain, connected through a 
linker region to a restriction endonuclease (En)- 
like domain, fits into a central density element at 
the base of the upper part of the tri-snRNP; the 
En domain points outward and is positioned below 
the U4 Sm core (Fig. 3A). Prp8’s C-terminal RNase 
H (RH)-like domain could be docked into a den- 
sity element located just above the linker region 
of the RT/En domain (Fig. 3A), and its orientation 
was confirmed by cross-linking (tables S1 and S2).
The architecture of Prp8’s RT/En domain and its 
position are essentially the same in the human and 
yeast tri-snRNP models, whereas Prp8’s RH do- 
main is rotated by ~180° in yeast relative to human 
(fig. S7) (9). 

In the S. pombe spliceosome, Prp8’s N-terminal 
800 amino acids consist of two domains, hence-
forth termed NTD1 and NTD2, that contain main-
ly α helices and are separated by a short linker
region, termed NTDL (Fig. 3A) (10). The larger 
NTD1 structure fits into a density element in the 
lower part of the tri-snRNP model and has a sub-
stantial interface with Snu114 and also contacts
stem 1 of U5 snRNA (Fig. 3A) (see below). Consist-
ent with our cross-linking data, the smaller NTD2
is located more toward the U4/U6 three-way junc- 
tion and interacts with Prp8’s RT domain (Fig. 3B 
and fig. S8A). The crystal structure of Dim1 fits 
into a density element between NTD1 and NTD2 
(Fig. 3B), a position supported by cross-linking 
(fig. S8A). The overall structure of Prp8’s NTD1 
and Snu114 is similar in the human tri-snRNP
and S. pombe spliceosome, including the lasso-
like protrusion of NTD1 that interacts with
Snu114’s D1 domain in a similar manner in both
complexes (fig. S8B) (10). Guided by multiple 
cross-links of the RecA2 domain of the Prp28 
helicase to Prp8’s NTD1 and RT/En domains 
(fig. S8A), we could fit the crystal structure of 
the two RecA domains into nearby density ele- 
ments (Fig. 3C). Prp28, which is not present in 
isolated Saccharomyces cerevisiae tri-snRNPs, 
exists in an open (inactive) conformation, very 
similar to its conformation in the crystal struc-
ture of isolated Prp28 (17, 18). Finally, we could
place the WD40 domain of U5-40K, which is 
conserved in S. pombe but not S. cerevisiae, into 
a density element close to U5’s ILSI (fig. S8C)—a 
position where it is also found in the S. pombe 
spliceosome (10). 
Brr2 helicase is found at very different
positions in human and yeast tri-snRNPs

The 245-kDa RNA helicase Brr2 contains two 
tandemly organized helicase cassettes, but only 
the N-terminal cassette (NC) actively unwinds 
the U4/U6 duplex during catalytic activation 
(19). The C-terminal Prp8-Jab1 domain binds 
tightly to Brr2’s active NC and regulates its helicase 
activity (20, 21). The crystal structure of the 
complete 200-kDa helicase unit of Brr2 bound 
to Prp8 Jab1 fits very well, as a rigid body, into a
major density element in the upper part of the 
tri-snRNP, near the RT end of the Prp8-RT/En 
domain, opposite the U4 Sm and U6 LSm rings 
(Fig. 4A). 

Besides this Brr2 NC-Prp8-Jab1 interaction, 
there appear to be at least two additional density 
elements connecting the helicase cassettes to other 
tri-snRNP proteins (Fig. 4, A to C). The N-terminal 
region of Brr2 contains a noncanonical PWI do- 
main (22) and a helical domain (23). The PWI 
domain fits into the density element connecting 
Brr2’s C-terminal cassette (CC) to Snu114 and Sad1
(Fig. 4B), while the N-terminal helical domain
(NHD) fits into a density element connecting Brr2’s
NC to Prp8’s RH domain and to the N-terminal- 
most three TPR repeats of Prp6 (Fig. 4C and table 
S1). Interestingly, Brr2’s NHD is located in front
of the RNA binding channel between the RecA2 
and helical bundle (HB) domain of Brr2’s NC (Fig. 
4C), consistent with this element acting like a plug, 
autoinhibiting Brr2 via substrate recognition (23). 
Brr2’s architecture and its connections to the above- 
mentioned proteins were confirmed by a network 
of cross-links between Brr2’s NHD and NC/CC 
domains, and between these domains and the 
Prp8’s RH and Jab1 domains, as well as Prp6’s 
N-terminal TPRs (fig. S9), and additionally be-
tween Brr2’s PWI and CC domains and the Snu114
and Sad1 proteins (fig. S10). 

Strikingly, Brr2 is located at radically different 
positions in the human and yeast tri-snRNP mod- 
els (Fig. 4D). Human (h)Brr2 (bound to hPrp8 Jab1) 
is located close to the N-terminal TPR repeats of 
Prp6 and Prp8’s RT end, and its general position 
in the tri-snRNP is not dependent on the use of a 
chemical cross-linking reagent during EM sample 
preparation (fig. S11). In contrast, yeast (y)Brr2 
(bound to the yPrp8 Jab1 domain) is found near 


Fig. 3. Structures and positions of Snu114, Prp8, Dim1, and Prp28 in the human tri-snRNP. (A) Location and structural organization of
Snu114 and Prp8. Left: Organization of Snu114 (domains D1 to D5 homologous to EF-G/EF-2) and Prp8 (NTD1 and NTD2, N-terminal domains
1 and 2; NTDL, NTD linker; RT, reverse transcriptase-like; X/L, linker; En, endonuclease-like; RH, RNase H-like; Jab1, Jab1/MPN-like).
Upper right: Fit of Snu114 domains D1 to D5 as a compact structure. Lower right: Fit of Prp8 NTD1, RT/En, and RH domains, with front global
clipping to improve Prp8 visibility. (B) U5 Dim1 between Prp8’s NTD1 and NTD2 domains (extended view at right). (C) Structural organization
of the RNA helicase Prp28. Left: Domain organization of Prp28. Right: Expanded view of dashed box labeled C in view 1 of (A), showing
Prp28’s RecA domains fit into two neighboring density elements.
yPrp8’s En domain, which is ~20 nm away from
the position of hBrr2. In addition, it is rotated by 
~180° around the long axis of the tri-snRNP (Fig. 
4D and fig. S12). In the yeast model, yBrr2 appears 
to be connected to the tri-snRNP primarily via the 
yPrp8 Jab1 domain (which contacts the tip of 
yPrp8’s En domain) and the U4 Sm core (Fig. 4D
and fig. S12) (9). Unfortunately, because of the 
less well-defined density at the interface between 
yBrr2 and other tri-snRNP proteins, the locations 
of yBrr2’s N-terminal PWI and helical domains 
cannot be identified in the yeast structure. Another 
striking difference is that the yeast U4 Sm core is 
located at the interface between yBrr2’s helicase 
cassettes, and the central single-stranded region 
of U4 snRNA, to which Brr2 is thought to dock 
prior to unwinding the U4/U6 duplex (fig. S1C) 
(24), is positioned at/near the RecA domains of 
the active NC of yBrr2 (Fig. 4D) (9). In contrast, 
in the human tri-snRNP structure, hBrr2’s active 
NC is located 8 to 10 nm away from its U4/U6 
snRNA substrate (Fig. 4D). 


Structural basis for how Sad1 likely 
tethers Brr2 in a preactivation position 


The very different position of Brr2 suggests either 
that there is a substantial difference in the spatial 
organization of the yeast and human tri-snRNPs, 
or potentially that the human and yeast structures 
represent two different conformational states that 
are obtained by rearrangements in protein archi- 
tecture. Although the first possibility cannot be 
rigorously excluded, we consider it unlikely, as the 
structures of Brr2 and all other major U5 proteins 




are evolutionarily highly conserved between yeast 
and human (1, 5). Instead, differences in the pro-
tein composition of the purified human and yeast
tri-snRNPs potentially lead to different conforma- 
tions. That is, in the presence of adenosine tri- 
phosphate (ATP), isolated human tri-snRNPs are 
stable, whereas yeast tri-snRNPs are not (9, 25, 26). 
This is likely because the evolutionarily conserved 
Sad1 protein is stoichiometrically present in pu- 
rified human tri-snRNPs (25) but is lost during 
purification of the yeast complex (26, 27). Sad1 
plays a key role in stabilizing the tri-snRNP, as 
depletion of Sad1 from yeast cell extracts leads to
dissociation of the otherwise stable tri-snRNP in 
an ATP- and Brr2-dependent manner into a U4/ 
U6 di-snRNP (where U6 and U4 are still base- 
paired) and U5 snRNP (28). Consistent with it 
contributing to tri-snRNP stability, human 
Sad1 is located at a strategically important po- 
sition at the interface between the U4/U6 and U5 
snRNPs. The Sad1 UCH domain contacts U4/U6-
Prp31 and the Prp8 NTD2 and RT domains, where-
as Sad1’s Zf-UBP domain has a substantial interface
with domains D2, D3, and D4 of Snu114 and is
tightly connected to Brr2’s PWI domain (Fig. 5,
table S1, and fig. S10).

Thus, Sad1 not only potentially acts as a clamp 
stabilizing the interaction of U4/U6 and U5, it 
might also help to tether Brr2 in a preactivation 
position (i.e., away from the U4/U6 duplex) with- 
in the human tri-snRNP. This in turn suggests 
that dissociation of Sad1—as observed during ac- 
tivation of the human B complex (7)—might allow 
Brr2 to undergo a major conformational change 




that is required for it to interact with its U4/U6 
snRNA substrate. Because Sad1 is absent from 
purified yeast tri-snRNPs, the very different po- 
sition of Brr2 in the yeast tri-snRNP may there- 
fore represent a conformational state similar to 
the one that Brr2 normally adopts at a later stage 
during spliceosome activation. Whereas the yeast 
cryo-EM model lacks density in the correspond- 
ing regions where Sad1 and Brr2 are located in the 
human tri-snRNP structure, the crystal structures 
of Sad1 and Brr2 can be docked well onto the sur-
face of the yeast tri-snRNP at the corresponding
positions (fig. S13). It will be of interest to de- 
termine the 3D structure of the yeast tri-snRNP 
in the presence of ySad1. 


Remodeling of the human tri-snRNP 
during spliceosome assembly 
and activation 


The spatial architecture of the human tri-snRNP 
provides important insight into the function of 
several proteins and also reveals the likely dock- 
ing site of the tri-snRNP with the spliceosomal A 
complex during B complex formation. That is, 
the 3’ end of U6 and Prp8’s RH domain, which 
interact with U2 snRNA to form U2/U6 helix II
(fig. S1C) and with the pre-mRNA’s 5'SS, respec-
tively, during A complex docking, are located at
accessible positions at the “top” of the tri-snRNP 
(fig. S14A), consistent with the general architec- 
ture of the spliceosomal B complex previously 
revealed by EM (29). 

The architecture of the human tri-snRNP also 
indicates that several of its proteins and RNA 


S. cerevisiae cryo-EM density 
(EMD-2966) 


Fig. 4. Structure and location of the RNA helicase Brr2 and Prp8 Jab1 domain. 
(A) Location and structural organization of Brr2. Left: Organization of hBrr2. NHD, 
N-terminal helical domain; PWI, N-terminal, noncanonical PWI domain; NC/CC, 
N-terminal/C-terminal helicase cassette. Right: Expanded view showing fit of hBrr2’s 
helicase region in complex with Prp8-Jab1. Circles: Brr2-Jab1 interface (black oval),
additional density elements connecting Brr2’s NC (green circle) and CC (red circle) 
to the tri-snRNP [see (B) and (C)]. (B) Right: Expanded view showing fit of Brr2's
N-terminal PWI domain [red circle in (A)]. (C) Expanded view showing fit of
Brr2’s NHD [green circle in (A)]. (D) Brr2 is located at radically different 
positions in the human and yeast tri-snRNPs, and is found at opposite ends of 
Prp8's RT/En domain in the two models. The cryo-EM densities of the human (left) 
and yeast (right) tri-snRNPs (9) are shown in corresponding views as insets. 


25 MARCH 2016 * VOL 351 ISSUE 6280 1419 


RESEARCH | RESEARCH ARTICLE 




Fig. 5. Sad1 is located in a position bridging U5 and U4/U6 proteins. (Left) Fit of Sad1’s ubiquitin
C-terminal hydrolase (UCH)-like domain (including linker) into a density element that is connected to
several U5 proteins and the U4/U6 protein Prp31. (Right) Sad1's ubiquitin protease (Zf-UBP)-like
domain fits into a neighboring density that is connected to Snu114’s D2-D4 domains and Brr2’s PWI
domain.


elements must undergo major, sequential con- 
formational changes during B complex formation 
and spliceosome activation. One major re- 
arrangement concerns Prp28, which catalyzes 
the transfer of the 5'SS from U1 to the ACAGA box 
of U6 snRNA. As this likely occurs at the Prp8 RH 
domain (30), Prp28 must move from its outward 
position through the cleft between Brr2 and the 
U4 Sm domain toward the RH domain (fig. 
S14A). In fact, the Prp28 “stalk” appears to be
intrinsically flexible and undergoes movements 
within the isolated tri-snRNP consistent with 
this proposed rearrangement (31). For catalytic
activation of the spliceosome, Brr2’s NC and 
the U4/U6 duplex must be juxtaposed. This 
could be achieved by movement of Brr2’s heli- 
case domain across the cleft between Brr2 and 
the U4 Sm core toward the U4/U6 snRNAs 
(fig. S14A). 

Additionally, Prp8 appears to undergo a sub- 
stantial structural change during spliceosome 
activation. That is, whereas the overall structure 
of Prp8’s large N-terminal NTD1 domain is sim- 
ilar in the human tri-snRNP and S. pombe splice- 
osome models, the RT/En domain adopts a clearly 
different position in both complexes (figs. S14B 
and S15) (10). In the tri-snRNP it points upward,
whereby the tip of the En domain is ~5 nm away
from NTD1, resulting in an open conformation. In
contrast, in the S. pombe spliceosome, Prp8 adopts 
a closed conformation where the En domain in- 
teracts closely with NTD1 (figs. S14B and S15). As 
the overall structure of the RT/En domain does 
not change, Prp8 achieves the closed conformation 
by a downward movement of the RT/En domain, 
whereby the pivoting point appears to be located 
at the interface between the RT and NTD1 do- 
mains (figs. S14B and S15A). The position of Prp8’s 
RH domain undergoes a similar downward shift 
(fig. S14B). This structural change within Prp8 is 
required to create the pocket into which the re- 
arranged catalytic U2/U6 RNA network and U5 
snRNA loop 1 are docked in the S. pombe splice- 
osome (fig. S15, B and C) (32). Interestingly, the 
U5 snRNA loop 1, which also interacts with the 3’ 
end of the pre-mRNA’s 5’ exon in the activated 
spliceosome (33), is already located in the tri- 
snRNP near Prp8’s emerging active-site region, 
and thus it need not be substantially repositioned
(fig. S14B). 

The aforementioned rearrangements can only 
occur when several proteins are displaced con- 
comitantly from their positions in the tri-snRNP. 
For example, in the tri-snRNP, Dim1 is located in
the same area where the center of the U2/U6 
catalytic RNA network is found in the S. pombe 
spliceosome (fig. S15, B and C) (32). Possibly 
Dim1 and the RecA2 domain of Prp28, which are 
both located between Prp8’s RT/En and NTD do- 
mains (fig. S15B), may stabilize the open confor- 
mation of Prp8 in the tri-snRNP. Prp31, Prp3, and 
Prp4 must also be displaced from the U4 and/or
U6 snRNAs. Indeed, except for Prp8, all of these 
proteins, plus Sad1 and Prp6, are displaced from
the spliceosome during activation (1). How these
multiple rearrangements are orchestrated is cur-
rently not clear. Snu114 has been implicated in the
activation process (34), and if it should undergo 
a conformational switch from a compact to an 
elongated state, similar to EF-2/EF-G in the ri- 
bosome during translocation (16, 35, 36), several 
coordinated movements of other tri-snRNP pro- 
teins would result (figs. S6 and S14A). For ex- 
ample, Brr2’s PWI domain, which (together with 
Brr2’s NHD) provides major contact points be- 
tween Brr2 and other U5 proteins as well as 
Sad1, would likely be destabilized; this could
potentially facilitate movement of Brr2 toward 
U4/U6. The elucidation of the structural dynam- 
ics of the various events that take place during 
spliceosome activation will require numerous cryo- 
EM “snapshots” of the spliceosome during its 
multistep assembly pathway. 
C-H BOND ACTIVATION


Catalyst-controlled selectivity in the
C-H borylation of methane and ethane


Amanda K. Cook, Sydonie D. Schimler, Adam J. Matzger, Melanie S. Sanford* 


The C-H bonds of methane are generally more kinetically inert than those of other 
hydrocarbons, reaction solvents, and methane functionalization products. Thus, developing 
strategies to achieve selective functionalization of CH4 remains a major challenge. Here, we 
report transition metal-catalyzed C-H borylation of methane with bis-pinacolborane (B2pin2) in
cyclohexane solvent at 150°C under 2800 to 3500 kilopascals of methane pressure. Iridium, 
rhodium, and ruthenium complexes all catalyze the reaction. Formation of mono- versus 
diborylated methane is tunable as a function of catalyst, with the ruthenium complex providing 
the highest ratio of CH3Bpin to CH2(Bpin)2. Despite the high relative concentration of 
cyclohexane, minimal quantities of borylated cyclohexane products are observed. Furthermore, 
all three metal complexes catalyze borylation of methane with >3.5:1 selectivity over ethane. 


Over the past 50 years, numerous homoge-
neous transition-metal catalysts have been
developed for the C-H functionalization
of liquid alkanes [for example, via de-
hydrogenation (1), oxygenation (2), carbon-
ylation (3), borylation (4-7), and C-, N-, and O-atom
insertion (8, 9)]. However, relatively few of these
catalysts have been translated to analogous re-
actions of methane (10-14). This is largely due
to the particular challenges associated with meth- 
ane C-H functionalization. First, the C-H bonds 
of methane are stronger than those of most liq-
uid alkanes [the C-H bond dissociation energies
(BDEs) of methane, n-hexane (1° C-H), and cyclo-
hexane are 105, 101, and 99.5 kcal/mol, respective-
ly (15, 16)]. As such, methane C-H bond cleavage
is prohibitively slow with many catalysts. Second, 
homogeneous alkane functionalization reactions 
are typically conducted by using neat alkane as 
the solvent (4, 5, 14), so the use of methane gas 
as a substrate poses challenges with respect to 
identifying a compatible reaction solvent (12, 17). 
Last, the reaction solvent and the CH3X products 
of methane functionalization typically contain 
more reactive C-H bonds than those of CH4. As
such, developing strategies to achieve selective
functionalization of CH4 in the presence of solvent
and CH3X remains a challenging problem (10-14).

[Fig. 1 scheme labels. (A) Proposed selective mono-C-H borylation of methane: H3C-H + B2pin2. (B) Selectivity challenges: H3C-H, desired reactant, sterically activated (most sterically accessible C-H bond); CH3Bpin, initial C-H borylation product, electronically activated (most acidic C-H bond).]

We sought to identify a methane C-H function- 
alization process in which selectivity (both for 
CH4 versus CH3X functionalization and for CH4
versus solvent C-H functionalization) could be 
tuned through modification of the homogeneous 
transition-metal catalyst. To accomplish this goal, 
we focused on the catalytic C-H borylation of meth-
ane with B2pin2 (Fig. 1A). Over the past 15 years,
there has been tremendous progress in the devel- 
opment of transition-metal catalysts for the C-H 
borylation of liquid alkane substrates. Catalysts 
based on iridium (Ir) (18, 19), rhodium (Rh) (20-22),
rhenium (Re) (23), and ruthenium (Ru) (24) have 
been reported for liquid alkane C-H borylation, typ- 
ically by using the alkane substrate as the solvent 
and B2pin2 as the borylating reagent (19, 21, 23-25).
With the vast majority of liquid alkane substrates, 
the selectivity of C-H borylation is dominated by 
steric factors, with terminal (primary) C(sp3)-H
bonds undergoing selective functionalization over 
secondary or tertiary C-H sites (25, 26). This se- 
lectivity has been reported to be largely indepen- 
dent of the nature of the transition-metal catalyst. 
For example, the C-H borylation of n-alkanes 
(n-CnH2n+2) with B2pin2 affords 1-Bpin-CnH2n+1
as the sole detectable product with Ir-, Rh-, Re-, 
and Ru-based catalysts (18, 20, 23, 24). 

In certain contexts, the introduction of a Bpin 
substituent has been shown to electronically acti- 
vate adjacent C-H bonds toward further C-H 
borylation by rendering them more acidic (27, 28). 
This electronic activation has been best studied 
in the context of benzylic substrates, in which the 
C-H borylation of 1°-benzylic C-H bonds is often 
slower than that of the 2° α-boryl benzylic C-H
bonds of the products (29, 30). However, the inter- 
play between these steric and electronic effects 
has not been extensively explored in the C-H 
borylation literature, especially as a function of 
catalyst metal identity. As discussed below, these 


[Fig. 1B label: reaction solvent, statistically favored (highest concentration C-H bond).]

Fig. 1. Reactivity and selectivity challenges in the C-H borylation of methane. (A) Proposed methane C-H borylation reaction. (B) Challenges with
selectivity in methane C-H borylation.
issues are expected to be particularly salient in
the context of methane C-H borylation (Fig. 1B). 

In 2005, Hall and co-workers reported density 
functional theory (DFT) calculations that suggest 
that Cp*Rh complexes (Cp*, pentamethylcyclo- 
pentadienyl) should be capable of catalyzing the 
C-H borylation of CH4 (22). Despite these encour-
aging computational results, there have been no
subsequent experimental studies establishing the 
feasibility and/or exploring the selectivity of meth- 
ane C-H borylation with these or any other catalysts. 
In a methane C-H borylation reaction, three major
C-H bond-containing molecules will be present
in solution: methane, CH3Bpin, and solvent (cyclo-
hexane) (Fig. 1B). Among these three molecules,
methane has the most sterically accessible C-H
bonds, CH3Bpin has the most electronically ac-
tivated (acidic) C-H bonds, and the reaction sol-
vent, cyclohexane, is statistically favored because
of its high concentration. Our studies sought to 
(i) experimentally establish the feasibility of metal- 
catalyzed methane C-H borylation; (ii) determine 
which factor (or factors) dominate selectivity in 
this transformation (sterics, electronics, or statis- 
tics); and (iii) probe whether different catalysts 
can be used to tune the selectivity of the reaction. 

We selected Rh complex 1 for our initial in- 
vestigations of methane C-H borylation on the 
basis of Hall and co-workers’ DFT calculations, which 
predicted a relatively low barrier for CH4 acti-
vation with this complex (22). The initial reactions
were conducted in a Parr high-pressure batch 
reactor (45 mL volume) at 150°C, using 1.5 mole 
percent (mol %) of 1, 3500 kPa of methane, and 
0.89 mmol of B2pin2 as the limiting reagent (31).
As discussed above, the choice of solvent is par- 
ticularly critical because any C-H bonds in the 
solvent must be less reactive with 1 than those of 
CH4. Thus, we first examined solvents without C-H
bonds [perfluoromethylcyclohexane (PFMCH) and 
perfluorohexane (PFH)]. However, modest yields 
of methane C-H borylation products were ob- 
tained (Table 1, entries 1 and 2), likely because of 
the low solubility of the Rh catalyst in these me-
dia. We next examined cyclopentane (c-C5H10)
and cyclohexane (c-C6H12) as solvents (Table 1,
entries 3 and 4). These cycloalkanes are both 
known to be poor substrates for Rh-catalyzed 
C-H borylation (6, 20, 21) because the 2°C-H 
bonds are relatively sterically congested and 
weakly acidic (32). Cyclohexane proved to be op-
timal, affording CH3Bpin in 74% yield with only
traces (~2%) of the solvent C-H borylation product
c-C6H11Bpin (Table 1, entry 4). Under these condi-
tions, high selectivity was also observed for the
mono-borylation of methane [ratio of CH3Bpin to
bis-borylated CH2(Bpin)2 was 10:1]. Increasing the
loading of catalyst 1 to 3 mol % resulted in 99% yield
of CH3Bpin, while maintaining excellent selectiv-
ity for CH3Bpin over c-C6H11Bpin and CH2(Bpin)2
(entry 6). Lowering the catalyst loading to 0.75
mol % resulted in decreased yield of CH3Bpin
(51%) but increased turnover number (68 turn-
overs) (entry 5) relative to the standard conditions.
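
The trade-off between yield and turnover number described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that both the yield and the catalyst loading are expressed relative to B2pin2 as the limiting reagent, so that the turnover number (TON) is simply the percent yield divided by the mol % of catalyst. The function name `turnover_number` is hypothetical and not taken from the report.

```python
def turnover_number(yield_percent: float, catalyst_mol_percent: float) -> float:
    """Moles of CH3Bpin formed per mole of catalyst.

    Assumes yield and loading share the same reference (B2pin2),
    so the ratio of the two percentages is the turnover number.
    """
    if catalyst_mol_percent <= 0:
        raise ValueError("catalyst loading must be positive")
    return yield_percent / catalyst_mol_percent


# (loading in mol %, yield of CH3Bpin in %) for catalyst 1, as quoted in the text
conditions = [(0.75, 51.0), (1.5, 74.0), (3.0, 99.0)]

for loading, product_yield in conditions:
    ton = turnover_number(product_yield, loading)
    print(f"{loading:4.2f} mol % catalyst, {product_yield:4.0f}% yield: TON = {ton:.0f}")
```

Under this assumption the 0.75 mol % entry gives 68 turnovers, matching the value quoted in the text; the lower loading gives the highest TON even though its yield is the lowest of the three.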

We next examined Ir and Ru complexes 2/3 and 
4 as potential catalysts for methane C-H borylation.
These complexes were selected on the basis of their 
Table 1. Methane C-H borylation catalyzed by 1.

[Reaction scheme: H3C-H (3500 kPa) + B2pin2, cat. 1, solvent, 150°C, 14 h, giving CH3Bpin + HBpin; c-C6H11Bpin and CH2(Bpin)2 are also labeled.]

Entry   Mol % 1   Solvent   Yield*   TON   CH3Bpin:solventBpin*   CH3Bpin:CH2(Bpin)2*

*Yields and ratios of all products were determined by means of gas chromatography-flame ionization
detector (GC-FID) and are based on B2pin2 as the limiting reagent.

†na, not applicable.


Table 2. Impact of catalyst on the yield and selectivity of methane C-H borylation.

[Reaction scheme: H3C-H (3500 kPa) + B2pin2, catalyst, c-C6H12, 150°C, 14 h, giving CH3Bpin + HBpin, CH2(Bpin)2, and c-C6H11Bpin.]

Entry   Catalyst   Time (hours)   Yield*   TON   CH3Bpin:CyBpin*   CH3Bpin:CH2(Bpin)2*
1       2/3        14             45%      15    3:1               4:1

*Yields and ratios of all products were determined by means of GC-FID and are based on B2pin2 as the
limiting reagent.

†Reactions stopped before completion in order to compare selectivity at similar yields.
known catalytic activity for the C-H borylation of
liquid alkanes (18, 19, 24, 33, 34). Under the opti-
mal conditions for catalyst 1, the combination of
Ir complex 2 and ligand 3 (18) afforded moderate 
yield (45%) of CH3Bpin, whereas Ru complex 4 
provided 67% yield (Table 2, entries 1 and 2). To 
more quantitatively compare these three catalysts, 
reaction progress was monitored as a function of 


time, and the results are shown in Fig. 2. These 
studies show that all of the reactions achieve a 
maximum yield within 10 hours. However, the ini- 
tial reaction rate with Rh catalyst 1 is approximate- 
ly four times faster than that with 2/3. Furthermore, 
4 displays a lengthy induction period (~2 hours),
suggesting that it serves as a precatalyst for this 
transformation (24, 35). 


sciencemag.org SCIENCE 


RESEARCH | REPORTS 


[Fig. 3 plot labels: CH4 (3500 kPa) + CH3Bpin (1 equiv); panel (A), Ir complex 2/3; axes, % yield of CH2(Bpin)2 formation and CH3Bpin formation. Value shown beneath one panel: CH3Bpin : CH2(Bpin)2 = 4.5 ± 0.4 : 1.]

Fig. 3. Evaluation of the selectivity in CH4/CH3Bpin borylation. (A to C) Reaction time profiles (top) for (A) Ir catalyst 2/3,
(B) Rh catalyst 1, and (C) Ru catalyst 4. Red squares (left y axes) represent formation of CH2(Bpin)2, and blue circles
(right y axes) represent formation of
[Fig. 4 schemes and data. (A) Ethane (3500 kPa) + B2pin2 (1 equiv) in c-C6H12, giving CH3CH2Bpin + c-C6H11Bpin + HBpin.

catalyst   yield CH3CH2Bpin   CH3CH2Bpin:c-C6H11Bpin
2/3        48%                5:1
1          78%                65:1
4          21%                >100:1

(B) Methane + ethane (3500 kPa total) + B2pin2 (1 equiv), 3 mol % catalyst, in c-C6H12, giving CH3Bpin + CH3CH2Bpin + HBpin.

catalyst   selectivity factor
2/3        4.2x
1          6.1x
4          3.6x]

Fig. 4. Comparison of catalysts for ethane borylation. (A) Batch ethane borylation results for catalysts 
2/3, 1, and 4. (B) Methane and ethane one-pot competition. Selectivity factor is the preference for meth- 
ane over ethane borylation corrected for statistics and solubility. 


In the Table 2 data, the choice of catalyst has a 
major impact on the selectivity of C-H borylation, 
both for methane versus cyclohexane and for 
methane versus CH3Bpin. In particular, Rh and 
Ru catalysts 1 and 4 exhibit much higher selec-
tivity for CH4 than does the Ir catalyst system 2/3.
This effect is observed even when the reactions
are stopped at similar yield of CH3Bpin (~50% yield;
Table 2, entries 1, 3, and 4 for comparison). The
ratio of CH3Bpin to c-C6H11Bpin ranged from 82:1
(with catalyst 4) to 3:1 (with catalyst 2/3). Simi-
larly, the CH3Bpin to CH2(Bpin)2 ratio ranged from
31:1 (with catalyst 4) to 4:1 (with catalyst 2/3). These
results indicate that tuning of the catalyst struc- 
ture can be used to control this undesired over- 
functionalization reaction. 

To more quantitatively evaluate selectivity as a function of catalyst, we conducted competition experiments between CH4 (3500 kPa, 1.1 M) (36) and CH3Bpin (0.13 M, 1 equivalent relative to B2pin2) with each of the catalysts 1, 2/3, and 4. The time course of each reaction is shown in Fig. 3. The yield of CH3Bpin (Fig. 3, blue circles, right y axes) represents the additional CH3Bpin formed from the C-H borylation of CH4 (measured in excess of 100%, given the CH3Bpin equivalent added at the outset), whereas the yield of CH2(Bpin)2 (Fig. 3, red squares, left y axes) represents the product of C-H borylation of CH3Bpin. With Ir catalyst 2/3 (Fig. 3A), the quantity of diborylated product present exceeds that of CH3Bpin at all time points. In contrast, the concentration of CH3Bpin is much greater than that of CH2(Bpin)2 throughout the reactions catalyzed by Rh complex 1 and Ru complex 4 (Fig. 3, B and C, respectively). Using the ratio of CH3Bpin:CH2(Bpin)2 obtained at early time points and the concentrations of CH4 and CH3Bpin added at the


[Fig. 3 plot residue. x axes: Time (h). Panel annotation: CH3Bpin : CH2(Bpin)2 = 15 ± 2 : 1; kCH4/kCH3Bpin = 1.7 ± 0.2; ΔΔG‡ = -0.5 ± 0.1 kcal·mol^-1]

[Fig. 3 caption, continued] CH3Bpin. CH3Bpin:CH2(Bpin)2 ratios at early time points, relative rates, and ΔΔG‡ values for the three catalyst systems are given below their respective time profile graphs.

onset, we can estimate kCH4/kCH3Bpin (and thus approximate ΔΔG‡) for the C-H borylation of CH4 versus CH3Bpin for each catalyst (complete details of these calculations are provided in the supplementary materials). As shown in the bottom of Fig. 3, positive ΔΔG‡ values are observed for catalysts 1 and 2/3, reflecting faster C-H borylation of CH3Bpin versus CH4. The values of ΔΔG‡ are estimated as 0.53 and 2.48 kcal/mol for 1 and 2/3, respectively. In contrast, Ru catalyst 4 shows a reversal in selectivity, exhibiting a preference for CH4 over CH3Bpin, with an estimated ΔΔG‡ of -0.5 kcal/mol.
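
The estimate described above can be illustrated with a minimal sketch. The authors' actual procedure is in their supplementary materials; here we assume simple competition kinetics (each product forms at a rate proportional to k times the substrate concentration) and a reaction temperature of 150°C, neither of which is stated explicitly for these experiments. The function names are hypothetical. The concentrations (1.1 M CH4, 0.13 M CH3Bpin) come from the text, and the 15:1 and 4.5:1 early-time ratios are the Fig. 3 panel annotations.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def rate_ratio(product_ratio, conc_ch4, conc_ch3bpin):
    """k_CH4/k_CH3Bpin from an early-time CH3Bpin:CH2(Bpin)2 ratio.

    Assumption (not from the paper): each product forms at a rate of
    k * [substrate], so
    product ratio = (k_CH4/k_CH3Bpin) * [CH4]/[CH3Bpin].
    """
    return product_ratio * conc_ch3bpin / conc_ch4

def delta_delta_g(k_ratio, temp_kelvin):
    """ddG = dG(CH4) - dG(CH3Bpin) = -RT ln(k_CH4/k_CH3Bpin), in kcal/mol."""
    return -R_KCAL * temp_kelvin * math.log(k_ratio)

TEMP_K = 150.0 + 273.15  # assumed reaction temperature (150 degC)

for label, ratio in [("15:1", 15.0), ("4.5:1", 4.5)]:
    k = rate_ratio(ratio, conc_ch4=1.1, conc_ch3bpin=0.13)
    print(label, round(k, 2), round(delta_delta_g(k, TEMP_K), 2))
```

Under these assumptions the 15:1 ratio gives a rate ratio near 1.8 and a ΔΔG‡ near -0.5 kcal/mol, and the 4.5:1 ratio gives a ΔΔG‡ near +0.5 kcal/mol, in line with the values quoted for catalysts 4 and 1. A positive ΔΔG‡ in this sign convention means CH3Bpin reacts faster than CH4.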

The relative reactivity of methane and ethane is another important issue (given that ethane is the second most abundant component of natural gas) but is rarely addressed in alkane C-H functionalization reactions. In the few reported systems in which this has been studied, ethane is usually found to be much more reactive (17, 37-39). As shown in Fig. 4A, catalysts 1, 2/3, and 4 all catalyze the C-H borylation of ethane at 150°C in cyclohexane. Again, ethane borylation occurs in preference to cyclohexane borylation and shows a similar dependence on metal catalyst as with methane, with selectivities ranging from 5:1 (with 2/3) to >100:1 (with 4).

To probe catalyst selectivity for methane versus ethane, known molar quantities of each gas were added to the high-pressure reactor. The reactions were run to complete conversion of B2pin2, and the ratio of CH3Bpin to CH3CH2Bpin was determined with each catalyst. These ratios were then corrected for the number of C-H bonds in each substrate as well as the relative solubilities of the two gases (36). As shown in Fig. 4B, all three catalysts exhibit a >3.5:1 preference for the C-H borylation of methane relative to ethane, which is consistent with sterically controlled selectivity. Additionally, the level of selectivity varies with the catalyst. The Ir catalyst 2/3 and Ru catalyst 4 both react approximately fourfold faster with methane C-H bonds, whereas 1 is more selective for methane (approximately sixfold faster). These results further highlight the impact of catalyst on both reactivity and selectivity in the C-H borylation of light alkanes.
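
One plausible form of the statistical and solubility correction described above is sketched below. The exact correction used by the authors is given in their supplementary materials (36); the functional form, the function name, and the example concentrations here are assumptions for illustration only. The only fixed facts used are the C-H bond counts of methane (4) and ethane (6).

```python
def selectivity_factor(product_ratio, conc_methane, conc_ethane,
                       ch_bonds_methane=4, ch_bonds_ethane=6):
    """Per-C-H-bond preference for methane over ethane borylation.

    product_ratio: observed CH3Bpin : CH3CH2Bpin molar ratio.
    conc_methane, conc_ethane: dissolved concentrations of each gas
    (hypothetical inputs; the paper's solubility data are not shown here).
    """
    if conc_methane <= 0 or conc_ethane <= 0:
        raise ValueError("concentrations must be positive")
    # Correct for how much of each substrate is actually in solution.
    per_molecule = product_ratio * conc_ethane / conc_methane
    # Correct for the number of equivalent C-H bonds per molecule.
    return per_molecule * ch_bonds_ethane / ch_bonds_methane

# Hypothetical example: equal product amounts, ethane twice as soluble.
print(selectivity_factor(1.0, conc_methane=0.5, conc_ethane=1.0))
```

With these illustrative inputs a 1:1 product ratio corresponds to a threefold per-bond preference for methane, which shows why the raw product ratio alone understates the intrinsic selectivity when ethane is the more soluble gas.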

Overall, we have demonstrated that catalyst structure has a major impact on reaction rates and selectivities in the C-H borylation of methane. Over-functionalization of the initial product, CH3Bpin, can be limited through the appropriate selection of catalyst. These results open up exciting possibilities for catalyst design (to further modulate reactivity and selectivity in methane C-H borylation) as well as the application of the concepts delineated here for other light alkane C-H functionalization reactions.

C-H BOND ACTIVATION


Catalytic borylation of methane 


Kyle T. Smith, Simon Berritt, Mariano Gonzalez-Moreiras, Seihwan Ahn, Milton R. Smith III, Mu-Hyun Baik, Daniel J. Mindiola


Despite steady progress in catalytic methods for the borylation of hydrocarbons, methane has 
not yet been subject to this transformation. Here we report the iridium-catalyzed borylation of 
methane using bis(pinacolborane) in cyclohexane solvent. Initially, trace amounts of borylated 
products were detected with phenanthroline-coordinated Ir complexes. A combination of 
experimental high-pressure and high-throughput screening, and computational mechanism 
discovery techniques helped to rationalize the foundation of the catalysis and identify improved 
phosphine-coordinated catalytic complexes. Optimized conditions of 150°C and 3500-kilopascal
pressure led to yields as high as ~52%, turnover numbers of 100, and improved
chemoselectivity for monoborylated versus diborylated methane. 


Activation of methane is challenging because it is nonpolar, has strong sp3 C-H bonds, is sparingly soluble in both polar and nonpolar solvents, and has very high ionization energies and very low triple, boiling, and flashing points (1-8). Homogeneous catalysts that convert methane to products that could be used as liquid fuels are known, but these systems often require strong electrophiles and, in some cases, superacids and/or powerful oxidants (1, 2, 9-17). Chemoselectivity is another limitation in methane activation and functionalization. For instance, H3C-R (R = functional group) products resulting from methane activation and functionalization have more reactive C-H bonds than methane itself, hence often resulting in poor selectivity, overfunctionalization, and overoxidation. The pioneering work by Hartwig, Marder, and Smith on C-H bond borylation inspired our investigation into the catalytic functionalization of methane using a similar approach (18). Whereas stoichiometric and catalytic borylations of alkanes show marked selectivity for monoborylation of terminal methyl groups (18), analogous reactions with methane have not been thoroughly explored, despite this reaction being known for more than a decade. Fundamentally important is that the methyl-derived product is arguably a form of a mildly nucleophilic methyl transfer reagent, which complements the chemistry observed in electrophilic activation reactions in Shilov-type chemistry (9). Theory predicts that borylation of hydrocarbons with a borane (Eq. 1) is thermoneutral, whereas the weaker B-B bond in diboron reagents provides an enthalpic driving force of at least 12 kcal/mol, as shown in Eq. 2 (8). These considerations led us to pursue the catalytic borylation of methane using diboron reagents such as B2pin2 (pin = pinacolate).


H3C-H + H-B(OR)2 → H3C-B(OR)2 + H-H    ΔH = -1 to +1 kcal/mol    (1)

H3C-H + (RO)2B-B(OR)2 → H3C-B(OR)2 + H-B(OR)2    ΔH = -13 kcal/mol    (2)


Iridium systems are particularly promising for C-H activation of methane (1, 2), and some of the most active borylation catalysts use this transition metal (18). Therefore, we focused our attention on the commercially available iridium reagents [Ir(COD)(μ-Cl)]2, [Ir(COD)(μ-OMe)]2 (COD = 1,5-cyclooctadiene), and (MesH)Ir(Bpin)3 (MesH = mesitylene) (19), modifying them with a range of nitrogen-based ligands, some of which are summarized in Table 1. Suitable catalyst and reaction conditions were identified systematically by means of a high-pressure, high-throughput reactor (see fig. S1 for details). Both [Ir(COD)(μ-OMe)]2 and (MesH)Ir(Bpin)3 complexes gave some conversion to borylated methane products in cyclohexane (CyH) or tetrahydrofuran (THF) at pressures as low as 2068 kPa. Product yields were determined by gas chromatography-mass spectrometry (GC-MS) techniques with mesitylene as an internal standard.

Ir(I) precatalysts with supporting ligand combinations were exposed for 16 hours at 120°C to 2068 kPa of methane and B2pin2. Our results indicate that L3 (3,4,7,8-tetramethyl-1,10-phenanthroline) is the best nitrogen ligand. Among the products detected in the reaction mixture were H3CBpin (1), H2C[Bpin]2 (2), HBpin (3), and H3COBpin. We also observed the production of O[Bpin]2. Because hydrolysis of 1 and 2 is very slow on the basis of control experiments, we propose O[Bpin]2 to derive from a combination of hydrolysis of 3 during aerobic workup and analysis by GC-MS, as well as decomposition of B2pin2 or 3. The decomposition of B2pin2 may be metal-catalyzed, as ring-opening of pinacolborane with Ir catalysts has been documented recently (20). We did not observe any tri- or tetraborylated methyl products, H(4-n)C[Bpin]n (n = 3 or 4), whereas borylation of the solvent is barely detected under our conditions. Increasing CH4 pressures in small increments to 8274 kPa did not improve the mono- or diborylation reaction appreciably. Although gem-diborylation of alkanes is unknown, the gem-diborylation of benzylic groups has been documented (21, 22). Because three boryl moieties become incorporated into the active catalyst, as illustrated in Fig. 1, the diboron additive has an immediate impact on the reactivity. For instance, no reaction takes place when B2cat2 (cat = catechol) is used instead of B2pin2, which is consistent with previous experimental and computational studies showing that borylation is favored for more electron-rich catalysts (23, 24).

The observed lower yield of 3 compared to 1 and 2 may be due to a second, slower borylation cycle that consumes 3 (Eq. 1). Consistent with our findings, we have observed that 3 can be used as a reactant replacing B2(pin)2, but this reaction is much slower at 120°C (table S7). Other diboron reagents, such as B2(OH)4 or B2(NMe2)4, produced complex mixtures of products with intractable precipitates.

Table 1 summarizes some of our screening results with the most promising chelating polypyridyl ligands. The use of ligands L1 and L2 gave detectable amounts of H3CBpin, whereas the best results were obtained with L3 (3,4,7,8-tetramethyl-1,10-phenanthroline), which showed yields as high as 4.1% and chemoselectivity ratios of mono- versus diborylated products 1:2 as high as 4:1. Surprisingly, even [Ir(COD)(μ-OMe)]2 without exogenous ligand resulted in some borylation (<1%) (25), but overall the results listed in Table 1 suggest that these systems are stoichiometric with respect to methane borylation (supplementary text). Likewise, increasing the temperature to 150°C did not improve the reaction (table S6).

The mechanism of methane borylation was modeled with density functional theory calculations on the Ir-phenanthroline system, and the proposed catalytic cycle is summarized in Fig. 1. Before catalysis can take place, the [Ir(COD)(μ-X)]2 (X = OMe or Cl) undergoes a series of ligand substitutions to ultimately yield (phen)Ir(Bpin)3

Table 1. 1,10-phenanthroline ligands used in the borylation of methane. The ligands were added 
in a 2:1 ratio relative to dimeric Ir reagent and in a 1:1 ratio relative to independently prepared (MesH) 
Ir(Bpin)3. Solvent was either tetrahydrofuran (THF) or cyclohexane (CyH). Results with other ligands 
are shown in table S4. 


Ligands: L1, L2, L3 [structures not reproduced]

Reaction: CH4 (2068 kPa) + B2pin2 → H3C-Bpin (1) + H2C(Bpin)2 (2) + HBpin (3) + O(Bpin)2
Conditions: Ir reagent, ligand, solvent, 16 hours, 120°C, − MeOBpin

Entry | Ir reagent | Loading (mol %) | Ligand | Solvent | Percent yield 1 | Ratio 1:2
[Entry values are not legible in this copy. The legible rows list [Ir(COD)(μ-OMe)]2 (three entries) and (MesH)Ir(Bpin)3.]


Fig. 1. Proposed cycle for the monoborylation of methane with 1,10-phenanthroline as a supporting ligand.

Fig. 2. Computed structures of catalytic cycle states in Fig. 1. Nonessential hydrogens are omitted


Table 2. Variations of catalyst loading and time with the ligand dmpe for the borylation of methane. The ligand dmpe (Me2PCH2CH2PMe2) was used in a 2:1 ratio relative to the Ir precatalyst [Ir(COD)(μ-Cl)]2 in cyclohexane (CyH) under 3447 kPa of CH4.

Reaction: CH4 (3447 kPa) + B2pin2, with [Ir(COD)(μ-Cl)]2 and dmpe in CyH, X hours, 150°C, − ClBpin

Entry | Loading (mol %) | Time (hours)
[Entry values are not legible in this copy.]


[Table 2 scheme, continued: products H3C-Bpin (1) + H2C(Bpin)2 (2) + HBpin (3) + O(Bpin)2; column header: Percent yield 1]

(phen = 1,10-phenanthroline). This complex is the most plausible resting state of the catalyst and consists of an Ir(III)-d6 center in a pseudo-square-pyramidal coordination geometry labeled as a (see Fig. 2). The catalytic cycle commences with weak binding of methane at the empty coordination site to give the intermediate complex b, followed by oxidative addition traversing the likely rate-determining transition state b-TS at 25.9 kcal/mol (26). The iridium center in this intermediate c adopts a rare, but not unprecedented, seven-coordinate geometry (27). Next, the hydride and borane ligands swap position to give access to c-iso that can undergo reductive elimination of the boryl-methane product 1 to afford the Ir(III)-complex d, which reacts with another equivalent of the diboron source to regenerate the catalyst resting state a. We considered several alternative mechanisms, most notably a σ-bond metathesis pathway (28), but found that the mechanism shown


in Fig. 1 is energetically most favorable. A de- 
tailed analysis of the computational results sug- 
gested a potential optimization strategy: As the 
H-CH; bond is cleaved at the transition state 
b-TS, the Ir-center must undergo formally an 
oxidation from Ir(III) to Ir(V). Therefore, the hard 
N-based Lewis base ligands may not provide the 
ideal supporting ligand framework, as these lig- 
ands tend to decrease the polarizability of the 
valence electrons of the metal. Softer Lewis bases, 
such as the phosphine analogs of the N ligands, 
seemed likely to prove beneficial by increasing 
the polarizability of the metal. 

We tested the simple qualitative rationale 
from our computer model by exploring whether 
phosphine ligands offered improved reactivity 
toward C-B bond formation. Initial screens 
showed that phosphine ligands do not result in 
any notable borylation at 120°C with 2068 to 


3447 kPa of methane, but at 150°C the dmpe 


[Fig. 2 caption, continued] for clarity.


ligand (Me2PCH2CH2PMe2) improved the reaction substantially. Table 2 summarizes the best results from our screening. Varying catalyst loadings from 0.5 to 25 mole percent (mol %) led to conversion yields as high as 52% and catalytic turnover numbers (TONs) up to 104 with selectivity of 3:1 for monoborylated product 1 versus 2. Increasing the mol % of catalyst resulted in lower conversion, though the selectivity for mono- versus diborylation (1:2 ratio) of methane increased to as high as 9:1 (entry 1). Pressures below 1379 kPa afforded lower conversions, whereas pressures above 3447 kPa did not greatly improve the overall yield of products. Reactions required 16 hours for completion, and control experiments using similar amounts of dmpe/[Ir(COD)(μ-Cl)]2 and 1 as a reagent with 40 equivalents of B2(pin)2 (with or without methane) did yield the diborylated product 2. This result implies that the yield of monoborylation product is always greater than for diborylation with the dmpe scaffold.
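
The relation between yield, catalyst loading, and turnover number quoted above can be checked with a short sketch. It uses the usual definition of TON as moles of product per mole of catalyst, so that TON equals percent yield divided by mol % loading. Pairing the 52% yield with the 0.5 mol % loading is an assumption, because the individual Table 2 entries are not legible here, and the function name is hypothetical.

```python
def turnover_number(yield_percent, loading_mol_percent):
    """TON = mol product / mol catalyst = (% yield) / (mol % catalyst).

    Both quantities are taken relative to the same limiting reagent.
    """
    if loading_mol_percent <= 0:
        raise ValueError("catalyst loading must be positive")
    return yield_percent / loading_mol_percent

# Assumed pairing of the best yield with the lowest loading in the text.
print(turnover_number(52.0, 0.5))
```

Under that assumption the result matches the reported TON of 104. The same function also shows why a high loading cannot give a high TON: at 25 mol %, even a quantitative yield corresponds to only four turnovers.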

An inverse relationship between precatalyst concentration and borylation conversion has previously been observed in borylations with [Ir(COD)(μ-Cl)]2 precatalysts and N-chelating ligands, but no explanation was provided for this behavior (29). Recently, Finke and co-workers have analyzed similar counterintuitive behavior in hydrogenations with Ziegler-type nanoparticle catalysts prepared from Ir precatalysts (30). Likewise, benzene borylation has been described with Ir nanoparticles at 80°C with activities that are considerably lower than those for homogeneous catalysts (31). Both of these Ir nanoparticle-catalyzed reactions are poisoned by Hg. In our case, Hg addition to the reactions listed in Table 2 did not suppress catalysis. In addition, borylations with dmpe and phenanthroline-based ligands at 150°C with identical precatalyst loadings and concentrations give very different conversions (table S6). These observations are consistent with a homogeneous process in which the nature of the ligand affects catalysis. Lastly, methane activation over Ir/ZrO2 has been described, but high temperatures (~600°C) are typically required for these processes (32).

Because dmpe/[Ir(COD)(μ-Cl)]2 afforded the cleanest yield of monoborylated product 1, we conducted isotopic labeling studies using 13CH4-enriched methane (99% atom enriched, 1379 kPa) to unambiguously establish that methane gas is the source of methyl in 1. As anticipated, GC-MS results conclusively established the formation of 1-13C as the only product derived from CH4 borylation (fig. S7), excluding the possibility of pinacol or solvent degradation as possible sources of CH3. Mechanistically, we expect the phosphine system to follow the same route outlined above for the polypyridyl systems.

MAGNETOHYDRODYNAMICS


Large-scale magnetic fields 
at high Reynolds numbers in 
magnetohydrodynamic simulations 


H. Hotta, M. Rempel, T. Yokoyama


The 11-year solar magnetic cycle shows a high degree of coherence in spite of the turbulent nature 
of the solar convection zone. It has been found in recent high-resolution magnetohydrodynamics 
simulations that the maintenance of a large-scale coherent magnetic field is difficult with small 
viscosity and magnetic diffusivity (≲10^12 square centimeters per second). We reproduced
previous findings that indicate a reduction of the energy in the large-scale magnetic field for lower 
diffusivities and demonstrate the recovery of the global-scale magnetic field using unprecedentedly 
high resolution. We found an efficient small-scale dynamo that suppresses small-scale flows, 
which mimics the properties of large diffusivity. As a result, the global-scale magnetic field is 
maintained even in the regime of small diffusivities—that is, large Reynolds numbers. 


The Sun shows an 11-year magnetic activity cycle that exhibits a large degree of coherence. The Sun’s activity has been recorded in terms of number of sunspots, whose record has a long observation history dating
from 1610 (1). The coherence of the large-scale 
field is evident from the 11-year polar field rever- 
sals and parity rules of sunspot pairs (2) that show 
only very few violations. The solar magnetic field 
and its cyclic activity is thought to be maintained 
by dynamo action: the transformation of kinetic 
energy to magnetic energy by the turbulent mo- 
tion of the ionized plasma in the solar convection 
zone (3). A remaining mystery is the generation 
process of the coherent large-scale magnetic field 
in the presence of chaotic small-scale fields, which 
are expected because of the large magnetic and 
fluid Reynolds numbers of the solar convection 
zone. Some studies have already succeeded in re- 
producing a magnetic cycle in three-dimensional 
(3D) convection calculations (4-6). Recent calcu- 
lations, however, suggest that large fluid and 
magnetic Reynolds numbers—small viscosity and 
magnetic diffusivity—lead to a reduction of the 
energy and coherence of the global-scale magnet- 
ic field (7). 2.5D kinematic dynamo calculations 
with high magnetic Reynolds numbers suggest 
that the construction of the global-scale mag- 
netic field requires the suppression of the small- 
scale dynamo (8, 9). Although the suppression of 
the small-scale dynamo is caused by the strong 
shear in these investigations, they suggested a 
possibility of the nonlinear Lorentz feedback 
that can suppress the small-scale phenomena 
and cannot be included in these investigations 
because of the kinematic assumption, that is, ignoring the Lorentz force. The global dynamo calculations that show the coherent global-scale magnetic field (4-6) use relatively large viscosity and magnetic diffusivity (≳10^12 cm^2 s^-1 in solar cases) or a small number of grid points in ILES (implicit large-eddy simulation) approaches in order to suppress the small-scale chaotic magnetic field. However, the Sun has very small viscosity and magnetic diffusivity (1 and 10^4 cm^2 s^-1 at the base of the convection zone, respectively) (10, 11). Thus, we need to understand the construction mechanism of the global-scale magnetic field in the presence of small viscosity and diffusivity.

A hint is seen in recent high-resolution calcu- 
lations for the small-scale dynamo in the solar 
convection zone (12). For sufficiently high reso- 
lution, the small-scale dynamo becomes efficient, 
and the magnetic energy exceeds the kinetic energy 
on small scales. Then, the Lorentz force feedback 
becomes significant—the kinematic assumption 
is no longer valid, and the small-scale flow is sup- 
pressed. This process requires a high resolution 
to resolve the inertial scale of the turbulence well. 
Because of the substantial scale separation between the global scale of the Sun (circumference is 4.4 × 10^9 m) and the energy injection scale of the turbulence (density scale height is 6 × 10^7 m at the base of the convection zone), resolving an efficient small-scale dynamo in global simulations requires a large number of grid points and/or efficient numerical schemes that resolve small-scale turbulence.

Here, we present high-resolution calculations that resolve the turbulence inertial scale well and maintain an efficient small-scale dynamo even in the global domain. We adopt the reduced speed of sound technique (13) and solve the 3D magnetohydrodynamics equations in spherical geometry (r, θ, φ) with gravity and rotation. The solar standard model (Model S) is used for the background stratification (14). We have previously performed similar calculations, but without rotation (15) or magnetic field (16). In addition, the calculation time is much longer in this study. In order to focus on the interaction of large- and small-scale dynamo, we restrict our radial extent to 0.715R☉ < r < 0.96R☉, where R☉ is the solar radius, and the full sphere is covered by using the Yin-Yang grid (17). Here, we focus on four cases, which we name “Low,” “Medium,” “High,” and “High-S” in the following discussion. The grid size for the cases Low and Medium is 64(r) × 96(θ) × 288(φ) × 2(Yin-Yang). For the case Low, we adopt a relatively large explicit viscosity and magnetic diffusivity. The values are 10^12 cm^2 s^-1 at the top boundary with a dependence of 1/√ρ0, where ρ0 is the background density, which is the same as in the previous study with cyclic magnetic activity (6). For the case Medium, we exclude explicit viscosity and magnetic diffusivity and use only a slope-limited numerical diffusivity (18) in order to increase the effective resolution. Then, we increase the number of grid points by a factor of 4 in each direction in the case High, 256(r) × 384(θ) × 1152(φ) × 2(Yin-Yang), without explicit viscosity and magnetic diffusivity. In the ordinary spherical geometry, this number of grid points corresponds to 256(r) × 768(θ) × 1536(φ). Case High-S has twice the resolution of case High, with a grid size of 512(r) × 768(θ) × 2304(φ) × 2(Yin-Yang). Because we use the same values for the explicit viscosity and magnetic diffusivity in the case Low, the magnetic Prandtl number is unity. In addition, we use exactly the same slope-limited diffusion for the velocities and magnetic fields, and we expect that the effective magnetic Prandtl number would be unity for the cases Medium, High, and High-S. This is estimated and confirmed in the supplementary materials, section 2. We calculate 50 years for cases Low, Medium, and High, and we restrict the calculation time for High-S to 500 days. Case High-S is restarted from a snapshot of case High at 3000 days. The advection time scale at the base of the convection zone is estimated as Hp/vrms ~ 7 days, where Hp ~ 6 × 10^7 m and vrms ~ 100 m s^-1 are the pressure scale height and root-mean-square velocity at the base of the convection zone, respectively. Thus, 500 days correspond to about 70 advection times.
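
The grid sizes and time scales quoted above can be cross-checked with a few lines of arithmetic. This is an illustrative sketch, not part of the simulation code. The function names are hypothetical, and the conversion to an ordinary latitude-longitude grid assumes that each Yin-Yang patch spans 90° in θ and 270° in φ, which is consistent with the equivalence stated in the text but is not spelled out there.

```python
SECONDS_PER_DAY = 86400.0

def advection_time_days(scale_height_m, v_rms_m_per_s):
    """Advection time H_p / v_rms, converted from seconds to days."""
    return scale_height_m / v_rms_m_per_s / SECONDS_PER_DAY

def yin_yang_points(n_r, n_theta, n_phi):
    """Total grid points for two overlapping Yin-Yang patches."""
    return n_r * n_theta * n_phi * 2

def equivalent_latlon_grid(n_r, n_theta, n_phi):
    """Lat-lon grid with the same angular spacing as one patch.

    Assumes a patch covers 90 deg in theta and 270 deg in phi, so the
    full sphere needs 2x the theta points and 4/3x the phi points.
    """
    return n_r, n_theta * 2, n_phi * 4 // 3

tau = advection_time_days(6e7, 100.0)   # H_p ~ 6e7 m, v_rms ~ 100 m/s
print(round(tau, 1), round(500.0 / tau))
print(equivalent_latlon_grid(256, 384, 1152))
print(yin_yang_points(256, 384, 1152) // yin_yang_points(64, 96, 288))
```

The advection time comes out slightly under 7 days, so 500 days is roughly 70 advection times, as stated. The case High grid maps onto 256 × 768 × 1536 points in ordinary spherical geometry, and it contains 64 times as many points as the cases Low and Medium (a factor of 4 in each of three directions).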

Although the calculation time for High-S is not enough to investigate the evolution of the large-scale magnetic field, we include this case to further study the dependence of the small-scale dynamo on resolution. In all cases, we use the solar rotation rate Ω0/(2π) = 413 nHz (19) and a large thermal conductivity κ = 2 × 10^13 cm^2 s^-1 with a radial dependence of 1/√ρ0 (6). Large thermal conductivity reduces the convective velocity and increases the rotational constraint, which leads to solar-like differential rotation, that is, equatorial acceleration (20, 21). These settings are summarized in Table 1.

Energy flux and differential rotation are shown 
in the supplementary materials (figs. S1 and S2). 
When we increase the resolution, magnetic en- 
ergy increases, and Lorentz force feedback leads 
to a reduction of differential rotation. Regarding 
the energy flux, using our parameters the enthal- 
py (convective) flux transports 60% of solar lumi- 
nosity. This behavior does not depend on the 
resolution. The amplitude of the enthalpy flux
possibly changes the location of the large-scale
dynamo. For example, when the enthalpy flux is 
reduced, the convective velocity becomes small, 
and the rotational constraint becomes strong. 
In our study, the large-scale magnetic field is 
concentrated near the base of the convection zone. 
Strong rotational constraint possibly changes 
this location to the middle of the convection 
zone (22). 

Shown in Fig. 1 and movie S1 are the contours of radial velocity vr at r = 0.95R☉ (Fig. 1, A, C, and E) and longitudinal magnetic field Bφ at r = 0.72R☉ (Fig. 1, B, D, and F). Although the overall convection patterns among the cases are similar, the case High clearly includes the small-scale turbulence (Fig. 1E). Similar to the previous study (6), the global-scale coherent magnetic field is maintained at the base of the convection zone in the case Low (Fig. 1B). This is weakened with smaller diffusivities in the case Medium (Fig. 1D). The large-scale magnetic field seems recovered in the higher resolution (Fig. 1F). This finding is more clearly seen in butterfly diagram-type figures. The zonally averaged toroidal magnetic field ⟨Bφ⟩ at the base of the convection zone (r = 0.72R☉) is shown in Fig. 2, where angle brackets denote the zonal average. In the case Low, coherent magnetic field and its cyclic


Table 1. List of calculated cases. In the first three rows, the number of grid points, type of diffusivities, and calculation time are shown. The numbers of grid points are in the Yin-Yang grid. Thus, the total number of the grid points is Nr × Nθ × Nφ × 2. In the final two rows, the turbulent and mean magnetic energy density averaged from 0.715R☉ to 0.73R☉ are shown. The averaged period for the cases Low, Medium, and High is from 5000 to 12,500 days. The period for the case High-S is from 300 to 500 days. The value in parentheses for the case High shows the value from 3300 to 3500 days, the same period as for the case High-S. The unit for energy is erg per cubic centimeter.

Fig. 1. Comparison of the different resolutions. (A, C, and E) The radial velocity at r = 0.95R☉. (B, D, and F) The longitudinal magnetic field Bφ at r = 0.72R☉. The top, middle, and bottom rows show the results from the cases Low, Medium, and High, respectively.

Fig. 2. Toroidal magnetic field at the base of the convection zone. (A to C) Zonally averaged toroidal magnetic field ⟨Bφ⟩ at r = 0.72R☉. The results from the cases Low, Medium, and High are shown in (A), (B), and (C), respectively.

Fig. 3. Spectra of the cases Medium and High. The kinetic (solid) and magnetic (dotted) energy spectra at r = 0.72R☉. The red line shows the result from the case Medium. The blue and black lines show the results from the case High without magnetic field (hydrodynamic) and High with magnetic field, respectively. The averaged period is the same as in Table 1, from 5000 to 12,500 days.

variation is seen (Fig. 2A). Then, these features become weak in the case Medium (Fig. 2B). When the resolution is increased further, the coherent magnetic field and its cycle are recovered (Fig. 2C). Shown in Table 1 are the turbulent magnetic energy density B'^2/(8π), where B' = B − ⟨B⟩, and mean magnetic energy ⟨B⟩^2/(8π) averaged from 0.715R☉ to 0.73R☉, where the mean magnetic field is concentrated. Whereas the turbulent magnetic energy (dominant contribution to total magnetic energy) increases monotonically with increasing resolution, the mean magnetic energy has a different behavior. From Low to Medium, the energy decreases by a factor of 3, and half of the mean magnetic energy from the case Low is recovered in the case High.
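
The split into mean and turbulent magnetic energy defined above can be made concrete with a small sketch. It treats a single field component sampled at equally spaced longitudes on one latitude circle, whereas the paper also averages over radius and time. The function name and the sample values are hypothetical. Field values are taken in gauss, so the energy densities are in erg per cubic centimeter.

```python
import math

def magnetic_energy_split(b_samples):
    """Return (mean, turbulent) magnetic energy density in erg/cm^3.

    b_samples: field values in gauss at equally spaced longitudes.
    Mean energy is <B>^2/(8*pi); turbulent energy is <B'^2>/(8*pi),
    with B' = B - <B> and <.> the zonal (longitudinal) average.
    """
    n = len(b_samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean_b = sum(b_samples) / n
    fluctuations = [b - mean_b for b in b_samples]
    e_mean = mean_b ** 2 / (8.0 * math.pi)
    e_turb = sum(f * f for f in fluctuations) / n / (8.0 * math.pi)
    return e_mean, e_turb

# Hypothetical samples: a 2 G mean field with +/-1 G fluctuations.
e_mean, e_turb = magnetic_energy_split([3.0, 1.0, 3.0, 1.0])
print(e_mean, e_turb)
```

Because the fluctuations average to zero, the two parts add up to the total energy density ⟨B^2⟩/(8π). A strong small-scale field can therefore dominate the total energy while the mean-field energy, which measures the coherent large-scale field, evolves differently.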

Our finding can be understood with Fig. 3, which shows the energy spectra at the base of the convection zone (r = 0.72R☉). Whereas in case Medium, the magnetic energy is always smaller than the kinetic energy, indicating an inefficient small-scale dynamo, the small-scale (l > 40, where l is the spherical harmonic degree) magnetic energy exceeds the kinetic energy in the case High. This is an indication of an efficient small-scale dynamo. The possibility that the strong small-scale magnetic field is generated by the small-scale dynamo is supported by fig. S3. From the calculation without rotation (only small-scale dynamo action), we find a ratio (Emag/Ekin) comparable with that of the case High. From this, we conclude that the small-scale magnetic field in the case High is mostly generated by the small-scale dynamo.

The velocity amplitude in the small scale is 
significantly suppressed by the small-scale mag- 
netic field. In the case Medium, the small-scale 
flow leads to destruction of the global magnetic 
field. This is confirmed with an additional calcu- 
lation for the case Medium with the same explicit 
viscosity as the case Low, but no explicit magnet- 
ic diffusivity (fig. S4). That calculation also shows 
a similar level of coherent global-scale magnetic 
field as the case Low. In this control experiment, 
the large viscosity suppresses the small-scale flow, 
which tends to destroy the global-scale magnetic
field, whereas in the case High, the suppression is 
a consequence of feedback from the strong small- 
scale magnetic field. Previous studies suggested 
that the nonlinearity of the magnetic field can 
suppress the exponential growth of the small- 
scale dynamo and set a finite amplitude of the 
magnetic field, which may be essential in allow- 
ing the reproduction of the large-scale magnetic 
field (8, 9). Because our calculation is nonlinear, 
this effect is included. Our new finding here is 
that the suppression of the small-scale flow sup- 
ports the construction of large-scale magnetic 
field, while the small-scale dynamo remains efficient. 
Because the case High-S covers only a short time span 
owing to our limited computing resources, 
we cannot conclude that the large-scale magnetic 
field in the case High-S is self-consistently gen- 
erated by the large-scale dynamo. However, we 
can check the numerical convergence and the 
tendency of the small-scale dynamo in the currently 
unreachable high-resolution calculation. The kinetic 
and magnetic energy spectra of the cases High and 
High-S are shown in fig. S5. This figure shows 
that when we adopt higher resolution, the small-scale 
velocity is reduced further; that is, the effect we 
obtained becomes stronger at higher resolution. 

Our result demonstrates that a global-scale co- 
herent magnetic field can be maintained even with 
small viscosity and magnetic diffusivity, provided 
that the Lorentz-force feedback from a small-scale 
magnetic field is strong enough. We roughly 
estimate the Reynolds number for the cases High 
and High-S. The values are 2000 and 7000 for 
the cases High and High-S, respectively (the de- 
tailed explanation for the estimation is given in 
the supplementary materials). Thus, the effective 
diffusivities for the cases High and High-S are 
1.6 × 10¹¹ and 4.6 × 10¹⁰ cm² s⁻¹, respectively. 
Because it was found in (23) that the critical Reynolds 
number for exciting the dynamo is ~100, our 
achieved Reynolds numbers are ~20 and 70 times 
larger than the threshold. Although the presented 
calculation still has a much larger viscosity and 
magnetic diffusivity than those of the real Sun 
(< 10⁴ cm² s⁻¹), the obtained mechanism does not 
require any assumption of large “turbulent” diffu- 
sivities because Maxwell stresses from the small- 
scale field can mimic their effect. 
SOLAR CELLS 


Photon recycling in lead iodide 
perovskite solar cells 
Lead-halide perovskites have emerged as high-performance photovoltaic materials. We 
mapped the propagation of photogenerated luminescence and charges from a local 
photoexcitation spot in thin films of lead tri-iodide perovskites. We observed light emission 
at distances of 250 micrometers and found that the peak of the internal photon spectrum 
red-shifts from 765 to ≥800 nanometers. We used a lateral-contact solar cell with selective 
electron- and hole-collecting contacts and observed that charge extraction for 
photoexcitation >50 micrometers away from the contacts arose from repeated recycling 
between photons and electron-hole pairs. Thus, energy transport is not limited by diffusive 
charge transport but can occur over long distances through multiple absorption-diffusion-emission 
events. This process creates high excitation densities within the perovskite layer 
and allows high open-circuit voltages. 


Since 2009 (1), hybrid lead halide perovskite 
photovoltaics have shown a marked rise in power 
conversion efficiency to values that are almost 
comparable to that of crystalline silicon (2-7). 
This improved pho- 
tovoltaic performance has been attributed to well- 
suited material properties such as high absorption 
cross sections (8), long charge-carrier lifetimes 
(9), and high emission yields (10). Recent studies 
in single crystals have reported charge diffusion 
lengths of 175 μm (11, 12); in polycrystalline thin 
films, vertical diffusion lengths have been found 
to be longer than 1 μm (13, 14). Together with high 
radiative recombination yields and long carrier 
lifetimes, these properties raise the question of 
whether absorption and reemission of excited 
carriers can occur during the transport. We re- 
port that such “photon recycling” does indeed 
play a central role, allowing considerable increases 
over current descriptions in the characteristic 
lengths for charge and energy transport. 

Highly crystalline inorganic semiconductors 
with high internal quantum yields, such as GaAs, 
demonstrate the current record efficiencies in 
single-junction solar cells (15, 16). The low non- 
radiative recombination rates and high photo- 
luminescence (PL) yields of these materials allow 
one photoexcited state to undergo multiple radi- 
ative emission-absorption events before it is lost 
through nonradiative decay (17, 18). This photon 
recycling effect, together with photonic confine- 
ment caused by the difference in refractive index 
between the active material and its surround- 
ings, leads to a buildup of excited-state popula- 
tion in the bulk of the material, similar to a solar 
concentration effect (17). Additionally, the length 
scales for energy transport are not limited to a sin- 
gle charge diffusion length but can occur through 
multiple recombination-emission events in an in- 
terchange between light and charge states, which 
markedly enhances the transport length scales. 

Previous studies of lead iodide perovskites 
have shown a sharp absorption onset at the op- 
tical band edge, with an Urbach tail slope close 
to that of GaAs (8, 19), whereas the PL spectrum 
is homogeneously broadened by interaction with 
phonons, leading to a considerable intensity be- 
yond the band edge (20). Additionally, long car- 
rier lifetimes and low nonradiative losses have 
been reported (9, 10). These conditions could 
support photon recycling. 

We studied thin perovskite films (with a thick- 
ness of ~100 nm) on glass substrates [details of 
preparation and characterization can be found in 
the supplementary materials (21)]. Under these 
conditions, only 10 to 15% of internally generated 
PL escaped to the air above or to the glass below 
[calculation in the supplementary materials (21)], 
and the remaining emission was guided within 
the film (22). To measure the spatial distribution 
of photogenerated emission, we used a confocal 
optical microscope setup with separately con- 
trolled excitation and collection objectives and 
a spatial resolution of ~1.5 μm (Fig. 1A and fig. 
S1) (21). Photons propagating in the film could 
be scattered out of the film or be absorbed and 
re-emitted isotropically. We measured the emis- 
sion from the edge of the film, maximizing out- 
scattering and allowing the detection of both 
components. These results provide a direct probe 
of the internal photon distribution traveling through 
the film. Figure 1B shows spatial emission mapping. 
When excitation is near the edge (<4 μm), 
the observed spectrum is similar to the macroscopic 
PL of this film, centered at ~765 nm (Fig. 
1C). However, when the excitation objective was 
moved farther away from the collection spot, the 
internal spectrum continuously red-shifted to 
beyond 800 nm after the separation increased to 
50 μm. Unexpectedly, at these distances we still 
detected a blue (765-nm) component in the spec- 
trum at a wavelength similar to those of the 
initial emission spectrum, the origin of which 
we discuss below. 

We used photothermal deflection spectroscopy 
to measure the absorption coefficients α_λ 
(where λ is the wavelength) of the films and then 
compared our findings with the photolumines- 
cence excitation spectrum (Fig. 1C). Under con- 
ditions in which photons are mostly confined 
within a slab (formed here by the glass-perovskite- 
air structure), the Beer-Lambert law gives a decay 
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This relation indicates that the decay is mono- 
exponential for each wavelength, with an addi- 
tional radial factor. Using Eq. 1, the predicted 
spectral decay map following this law is plotted 
in Fig. 1D. The measured decay in the different 
spectral regions was substantially slower than 
the prediction from the Beer-Lambert law. To 
illustrate the difference, the decay for selected 
wavelengths was extracted (Fig. 1E) and compared 
with Beer-Lambert predictions. The Beer-Lambert 
predictions do not take scattering into account; 
scattering would accelerate the decay further, 
particularly in spectral regions of very low absorption. 
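
The Beer-Lambert baseline described above can be sketched numerically. The Python snippet below assumes the form I(r) ∝ exp(−α r)/r, which is our reading of a mono-exponential decay "with an additional radial factor"; the function name and the two absorption coefficients are hypothetical placeholders for illustration, not measured values from this work.

```python
import math


def beer_lambert_cylindrical(r_um, alpha_per_um, r0_um=1.0):
    """Relative intensity of guided light at radius r_um (micrometers).

    Assumed form (not taken verbatim from the paper):
        I(r) / I(r0) = (r0 / r) * exp(-alpha * (r - r0))
    The 1/r factor accounts for cylindrical spreading inside the slab.
    """
    if r_um <= 0 or r0_um <= 0:
        raise ValueError("radii must be positive")
    return (r0_um / r_um) * math.exp(-alpha_per_um * (r_um - r0_um))


# Hypothetical absorption coefficients (per micrometer), illustration only:
# strong absorption near the PL peak, weak absorption in the band tail.
ALPHA = {"765 nm": 1.0, "800 nm": 0.05}

for label, alpha in ALPHA.items():
    for r in (1.0, 10.0, 50.0):
        print(label, r, beer_lambert_cylindrical(r, alpha))
```

Under these assumptions the strongly absorbed wavelength vanishes within a few micrometers, which is why a persistent 765-nm signal at tens of micrometers points to a process beyond simple linear absorption.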

We attribute the main red-shifted peak to 
guided photons that out-scattered at the edge 
of the film. Scattering is nondispersive, so we 
expect these out-scattered photons to match the 
internal spectral distribution of photons travelling 
inside the film. The internal photon spectral dis- 
tribution was biased toward longer-wavelength 
photons that travelled farther between emission 
and absorption events due to the sharp decay of 
the absorption coefficient at the band tail. These 


Fig. 1. Spatial mapping of emission and photon 
spectrum. (A) Graphical representation of micro- 
scope setup and measurement geometry. (B) Exper- 
imentally measured light emission map for different 
separation distances between excitation and collection. 
(a.u., arbitrary units) (C) Comparison of normalized 
PL with PL excitation (PLE) and photothermal 
deflection spectra. (D) Predicted spatial light emis- 
sion spectra from the cylindrically decaying Beer- 
Lambert law. (E) Comparison between experiment 
(solid lines) and expected decay (dashed lines) from 
the Beer-Lambert law at 765 and 800 nm. The 
experimental data are not in agreement with sim- 
ple linear absorption, which suggests that addi- 
tional processes, such as photon recycling, maintain 
substantial photon intensity at large distances. 


Fig. 2. Photocurrent mapping of an interdigi- 
tated back-contact (IBC) perovskite solar cell. 
(A) Fabrication process of the IBC device: (left) pat- 
tern a flat sheet of ITO, (middle) electrodeposit TiO2 
on half of the “fingers” and PEDOT on the other 
half, and (right) spin-coat the photoactive perov- 
skite layer. (B) Photocurrent map at the edge of 
the active area of an IBC perovskite device. The 
lateral position is along the electrode direction. We 
observed photocurrent several tens of micrometers 
beyond the last electrode (x-axis position 0 μm, bold 
dashed line). (C) Comparison between normalized 
spatial decay of the photocurrent and square root 
of PL. These results suggest that photon densities, 
which propagate over large distances through the 
material assisted by photon recycling, can be ex- 
tracted as photocurrent. 
long-wavelength photons were absorbed and 
generated electron-hole (e⁻-h⁺) pairs far from the 
original excitation spot. When these e⁻-h⁺ pairs 
recombined, they regenerated the original emis- 
sion spectrum that peaks near 765 nm, giving 
rise to the second observed peak. The photon 
energy gain between absorption and re-emission 
occurred via phonon-assisted thermalization. 

With the observed spatial decay of photon 
intensity, charges are expected to be generated 
at comparable distances. To measure the spatial 
charge distribution directly, we designed a lat- 
eral solar cell with electron- and hole-selective 
electrodes. The fabrication process of this back- 
contacted device (Fig. 2A) began with a plain 
sheet of indium tin oxide (ITO) covering a glass 
substrate. Photolithography was used to make 
a pattern of ITO with interdigitated electrodes. 
The channel and electrode width was ~4 μm (a 
pitch of 8 μm). Electrodeposition was used to 
selectively deposit electron- and hole-blocking 
layers. On half of the electrodes, TiO2 was deposited 
from a solution of Ti(O2)SO4, and 
poly-3,4-ethylenedioxythiophene (PEDOT) from a 
3,4-ethylenedioxythiophene monomer-based solution 
was deposited on the remaining electrode 
surface. The TiO2 layer was formed by hydrolysis, 
whereas the PEDOT film formed by polymeriza- 
tion under an external bias. Finally, a layer of 
perovskite was spin-coated from a standard pre- 
cursor solution based on methylammonium iodide 
mixed with lead acetate in N,N-dimethylformamide 
(23), and uniform film formation was detected on 
both electrodes [see supplementary materials and 
methods (21)]. 

The electric response of this device measured 
in the dark revealed a diode-like rectifying be- 
havior. Under solar irradiation, a photovoltaic 
response with an open-circuit voltage of 0.5 V 
was observed (fig. S14), showing effective carrier 
selectivity at the electrodes. Photocurrent ex- 
traction appeared to be limited in our device 
compared with vertical solar cells, possibly due 
to energy barriers at the electrodes and an un- 
favorable charge collection geometry. For com- 
parison, a lateral solar cell without selective 
layers (fig. S15) (21) showed a reduced charge 
selectivity, as well as very limited voltage and 
photocurrent. 

We used confocal microscopy to map the 
spatially resolved photocurrent generation in 
these devices with an excitation resolution below 
1 μm. The photocurrent probed the number 
of photoexcited carriers that reached the 
electrodes. Figure 2B shows a spatial map of 
photocurrent, both across the interdigitated 
electrodes (<1 μm) and for photoexcitation beyond 
the electrodes (0 to 100 μm). The very slow 
falloff in photocurrent for excitation beyond 
the edge of the electrode structure (see also Fig. 
2C) extends well beyond the reported diffusion 
lengths of lead halide perovskite thin films. 

In the range of excitation fluences used in 
our experiment, PL in perovskites mainly arises 
from bimolecular recombination of charge carriers 
with volume density n (i.e., PL ∝ n²), consistent 
with e⁻-h⁺ recombination (10, 24). Hence, we 
can relate the locally generated PL, taken around 
the PL emission peak (765 nm), with the local 
charge density, and √PL ∝ n is a probe of the 
spatial charge distribution. Because PL decays 
cylindrically, it must be geometrically corrected 
by performing a line integral of √PL over the 
length of the electrode, to compare it with the 
photocurrent measurements. 
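
The geometric correction described above can be sketched as a simple numerical line integral. The Python snippet below is only an illustration under stated assumptions: the electrode edge is modeled as a straight line of finite length, `integrated_sqrt_pl` and `toy_pl` are hypothetical names, and the toy radial PL profile is invented for the example rather than taken from the measurements.

```python
import math


def integrated_sqrt_pl(x_um, pl_of_r, half_length_um=100.0, steps=2000):
    """Trapezoidal line integral of sqrt(PL(r)) along a straight electrode edge.

    x_um           : perpendicular distance from the excitation spot to the edge
    pl_of_r        : callable giving the PL intensity at radial distance r (um)
    half_length_um : half of the electrode length included in the integral
    """
    if x_um <= 0:
        raise ValueError("x_um must be positive")
    dy = 2.0 * half_length_um / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_length_um + i * dy
        r = math.hypot(x_um, y)  # radial distance to this point on the edge
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.sqrt(pl_of_r(r)) * dy
    return total


def toy_pl(r_um, decay_um=20.0):
    """Hypothetical cylindrically decaying PL profile (illustration only)."""
    return math.exp(-r_um / decay_um) / r_um


for x in (10.0, 25.0, 50.0):
    print(x, integrated_sqrt_pl(x, toy_pl))
```

Because √PL is taken as proportional to the carrier density n, the integrated quantity is the one to compare with the photocurrent collected along the electrode.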

In Fig. 2C, we compare the decay of the integrated 
√PL with the measured spatial decay 
of the photocurrent. We observed a similar 
decay in photocurrent and integrated √PL beyond 
8 μm, which is the resolution set by the 
electrode geometry. The agreement between 
these two quantities indicates that the red- 
shifted component of the recycled photons al- 
lows excitation transport over long distances, 
beyond carrier diffusion lengths, which even- 
tually can be extracted as photocurrent from a 
solar cell. 

To model this, we set up and solved a system 
of cylindrically symmetric partial differential 
equations based on existing theoretical ap- 
proaches for photon recycling (25-27). We ex- 
panded these concepts (27) to account for local 
Fig. 3. Predicted effects of photon recycling. (A) Change in external and internal PLQE as a function of 
distance derived from a photon recycling diffusion model as presented in the text, with reported mono- and 
bimolecular recombination (k_mono = 10⁶ s⁻¹, k_bi = 10⁻¹⁰ cm³ s⁻¹) and diffusion constants (D = 0.5 cm² s⁻¹) of 
photoexcited carriers. (B) Predicted spatial photocurrent decay for the model with and without photon 
recycling. 
excitation and calculate the local photon 
distribution in the film 
The charge-carrier concentration n and the photon 
density γ were modeled (at different wavelengths 
λ) as a function of distance from the 
excitation spot. Input parameters are reported 
experimental values for carrier diffusion [diffusion 
constant D = 0.5 cm² s⁻¹ (9, 11, 12)], mono- 
and bimolecular recombination rates of carriers 
[k₁ = 10⁶ s⁻¹ and k₂ = 10⁻¹⁰ cm³ s⁻¹; from our own 
measurements and (10, 28)], and the measured 
wavelength-dependent absorption coefficients 
α_λ (Fig. 1C). (Additionally, t is time, G is the 
generation rate, c is the speed of light, 2, is the 
refractive index, P_stay is the optical probability 
that a photon stays in the film rather than escaping, and P_λ is the probability that 
light will be emitted with a given wavelength.) The 
experimentally measured external bimolecular 
rate had to be adjusted to account for photon 
recycling (29, 30). All absorption of photons is 
assumed to result in the creation of e⁻-h⁺ pairs, 
but only the bimolecular channel is radiative 
(with the spectrum as shown in Fig. 1C), and 
a proportion 1 − P_stay [modeled to be 12.5% (21)] 
of these photons is lost through optical trans- 
mission out of the perovskite at the interfaces. 
The external PL quantum efficiency (PLQE) re- 
sults from multiple internal recycling events and 
is related to the internal PLQE by the geometric 
series 


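$$\mathrm{PLQE}_{\mathrm{ext}} = \sum_{n=1}^{\infty} (\mathrm{PLQE}_{\mathrm{int}})^{n} (1 - P_{\mathrm{escape}})^{n-1} P_{\mathrm{escape}} = \frac{\mathrm{PLQE}_{\mathrm{int}}\, P_{\mathrm{escape}}}{1 - \mathrm{PLQE}_{\mathrm{int}} (1 - P_{\mathrm{escape}})} \qquad (4)$$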


The external PLQE varies with distance from the 
excitation spot and the carrier density (Fig. 3A). 
Photon recycling could be very efficient near the 
excitation spot but dropped off at larger dis- 
tances for which charge-carrier densities are 
smaller. From Eq. 4, we find that internal PLQEs 
can exceed 50% for 1-sun illumination (here, sun 
is defined as the solar spectrum standard ASTM 
G173) (internal carrier density ~10"° cem~®), cor- 
responding to lower measured external PLQEs 
of ~10%. The observed recycling effect on extrac- 
tion would be increased in a solar cell with ho- 
mogenous illumination, which produces constant 
carrier densities over the full active area. 
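
As a quick numerical check of Eq. 4, the following Python sketch evaluates the closed form and compares it with a truncated version of the geometric series. The function names are hypothetical; the escape probability of 12.5% is the modeled value quoted in the text, and the internal PLQE of 50% is the threshold mentioned above, used here purely as an example input.

```python
def external_plqe(plqe_int, p_escape):
    """Closed form of Eq. 4: external PLQE after repeated recycling events."""
    if not (0.0 <= plqe_int <= 1.0 and 0.0 < p_escape <= 1.0):
        raise ValueError("plqe_int must be in [0, 1] and p_escape in (0, 1]")
    return plqe_int * p_escape / (1.0 - plqe_int * (1.0 - p_escape))


def external_plqe_series(plqe_int, p_escape, terms=200):
    """Truncated geometric series of Eq. 4, summed term by term.

    Term n: the excitation is re-emitted n times, the photon is retained
    (n - 1) times, and it finally escapes on the n-th emission.
    """
    return sum(
        plqe_int**n * (1.0 - p_escape) ** (n - 1) * p_escape
        for n in range(1, terms + 1)
    )


closed = external_plqe(0.5, 0.125)
series = external_plqe_series(0.5, 0.125)
print(closed, series)
```

With these inputs the closed form gives 1/9, i.e. an external PLQE of roughly 10% for an internal PLQE of 50%, in line with the values quoted in the text.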

In order for this model to match the experi- 
mental photocurrent, recycling must be taken 
into account to explain the observed long spa- 
tial decays (Fig. 3B). On average, the model predicts 
one recycling event per photoexcited charge carrier 
under 1-sun illumination before the carrier decays 
nonradiatively [calculation in the supplementary 
materials (21)] in a perovskite film in air on 
glass. This value relates to a photon recycling- 
assisted average excitation travel distance of 20 um 
(fig. S18). The average travel distance could be enhanced 
at larger charge densities (for example, under 
high fluences) and can reach values beyond 50 μm. 

In terms of e⁻-h⁺ transport, our results suggest 
that the average distance a charge carrier 
can travel in a perovskite is not limited by the 
charge-carrier diffusion length: As long as 
recombination is radiative and the photon stays 
in the film, the e⁻-h⁺ pair can be regenerated and 
can propagate over large distances. This process 
creates a distinction between extraction and charge 
diffusion lengths and allows us to resolve the apparent 
contradiction between reported high recombination rates 
and long diffusion lengths. 

What are the implications of the observations 
presented here for standard thin-film perovskite 
solar cells (3, 6)? The thin-film samples from our 
work provide valuable model systems for these 
structures. Using the model and parameters 
developed above, we estimate that, under open- 
circuit conditions, in a device with a thickness of 
350 nm and nonquenching electrodes, recycling 
produces a doubling of the internal photon density 
under 1-sun illumination. These effects can 
be enhanced further by minimizing nonradiative 
decay channels and by operating at higher 
fluences, such as in solar concentrators, where 
high bimolecular recombination rates dominate. 
In the ideal case of unity PLQE and a perfect 
back mirror, photon recycling can produce internal 
photon densities up to 25 suns (4n² with n = 
2.5) (31) in perovskite solar cells under open-circuit 
conditions. Photon management, such as 
the use of highly reflective back mirrors to min- 
imize photonic losses and texturing of the top 
surface, offers promising approaches for using 
photon recycling to improve photoconversion 
efficiencies of perovskite solar cells toward the 
Shockley-Queisser limit. Higher photon densities 
lead to higher internal luminescence and a build- 
up of excited charges, which increase the split of 
quasi-Fermi levels and enhance the achievable 
open-circuit voltage in a solar cell. 
Evaluating replicability of laboratory 
experiments in economics 


Colin F. Camerer, Anna Dreber, Eskil Forsell, Teck-Hua Ho, Jürgen Huber, 
Magnus Johannesson, Michael Kirchler, Johan Almenberg, Adam Altmejd, 
Taizan Chan, Emma Heikensten, Felix Holzmeister, Taisuke Imai, Siri Isaksson, 
Gideon Nave, Thomas Pfeiffer, Michael Razen, Hang Wu 


The replicability of some scientific findings has recently been called into question. To 
contribute data about replicability in economics, we replicated 18 studies published in the 
American Economic Review and the Quarterly Journal of Economics between 2011 and 2014. 
All of these replications followed predefined analysis plans that were made publicly available 
beforehand, and they all have a statistical power of at least 90% to detect the original 
effect size at the 5% significance level. We found a significant effect in the same direction as 
in the original study for 11 replications (61%); on average, the replicated effect size is 66% of 
the original. The replicability rate varies between 67% and 78% for four additional 
replicability indicators, including a prediction market measure of peer beliefs. 


The deepest trust in scientific knowledge 
comes from the ability to replicate empirical 
findings directly and independently. 
Although direct replication is widely applauded 
(1), it is rarely carried out in empirical 
social science. Replication is now more 
important than ever, because the quality of re- 
sults has been questioned in many fields, such as 
medicine (2-5), neuroscience (6), and genetics 
(7, 8). In economics, concerns about inflated 
findings in empirical (9) and experimental analy- 
ses (10, 11) have also been raised. In the social 
sciences, psychology has been the most active in 
both self-diagnosing the forces that create “false 
positives” and conducting direct replications 
(12-15). Several high-profile replication failures 
(16, 17) quickly led to changes in journal publication 
practices (18). The recent Reproducibility 
Project: Psychology (RPP) replicated 100 original 
studies published in three top journals in psy- 
chology. The vast majority (97) of the original 
studies reported “positive findings,” but in the 
replications, the RPP only found a significant 
effect in the same direction for 36% of these 
studies (19). 

In this report, we provide insights into the 
replicability of laboratory experiments in econom- 
ics. Our sample consists of all 18 between-subject 
laboratory experimental papers published in the 
American Economic Review and the Quarterly 
Journal of Economics between 2011 and 2014. 
The most important statistically significant finding, 
as emphasized by the authors of each paper, was 
chosen for replication (see section 1 of the sup- 
plementary materials and tables S1 and S2). We 
used replication sample sizes with at least 90% 
power (mean = 92%; median = 91%) to detect the 
original effect size at the 5% significance level. All 
of the replication and analysis plans were made 
public on the project website (supplementary ma- 
terials, section 1) and were also sent to the original 
authors for verification. 

There are different ways of assessing replica- 
tion, with no universally agreed-upon standard 
of excellence (19-23). We present results for the 
same replication indicators that were used in the 
RPP (19). As our first indicator of replication, we 
used a “significant effect in the same direction as 
in the original study” [Gelman and Stern (20) 
discuss the challenges of comparing significance 
levels across experiments]. 

The results of the replications are shown in 
Fig. 1A and table S1. We found a significant effect 
Fig. 1. Replication results. 
(A) Plotted are 95% CIs of replication 
effect sizes (standardized to 
correlation coefficients). The standardized 
effect sizes are normalized 
so that 1 equals the original effect 
size (fig. S1 shows a nonnormalized 
version). Eleven replications have a 
significant effect in the same direction 
as in the original study 
[61.1%; 95% CI = (36.2%, 86.1%)]. 
The 95% CI of the replication effect 
size includes the original effect size 
for 12 replications [66.7%; 95% 
CI = (42.5%, 90.8%)]; if we also 
include the study in which the entire 
95% CI exceeds the original effect 
size, this increases to 13 replications 
[72.2%; 95% CI = (49.3%, 95.1%)]. 
AER denotes the American Economic 
Review and QJE denotes the 
Quarterly Journal of Economics. 
(B) Meta-analytic estimates of 


in the same direction as in the original study for 
11 replications (61.1%). This is considerably lower 
than the replication rate of 92% (mean power) 
that would be expected if all original effects were 
true and accurately estimated (one-sample bino- 
mial test, P < 0.001). 
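
The comparison with the expected replication rate can be reproduced approximately with a plain binomial calculation. The Python sketch below assumes a one-sided lower-tail test with every study treated as having the mean power of 92%; the exact test specification used for the reported P value may differ, and the variable names are ours.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


n_studies = 18       # replications carried out
n_replicated = 11    # significant effect in the original direction
mean_power = 0.92    # mean statistical power reported in the text

# Probability of 11 or fewer successes if every original effect were true
# and accurately estimated (assumption: common success probability 0.92).
p_value = binomial_cdf(n_replicated, n_studies, mean_power)

# Number of failed replications expected from imperfect power alone.
expected_failures = n_studies * (1.0 - mean_power)

print(p_value, expected_failures)
```

Under these assumptions the tail probability falls below 0.001, and the expected number of chance failures is about 1.4 out of 18.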

A complementary method for assessing rep- 
licability is to test whether the 95% confidence 
interval (CI) of the replication effect size includes 
the original effect size (19) [Cumming (21) discusses 
the interpretation of CIs for replications]. 
This is the case in 12 of our replications (66.7%). 
If we also include the study in which the entire 
95% CI exceeds the original effect size, the num- 
ber of replicable studies increases to 13 (72.2%). 
An alternative measure, which acknowledges 
sampling error in both the original study and 
the replications, is to count how many replicated 
effects lie in a 95% “prediction interval” (24). 
This count is higher (83.3%) and increases to 
88.9% if we also include the replication whose 
effect size exceeds the upper bound of the pre- 
diction interval (fig. S2 and supplementary ma- 
terials, section 2). 

The mean standardized effect size (correlation 
coefficient, r) of the replications is 0.279, compared 
with 0.474 in the original studies (fig. S3). 
This difference is significant [Wilcoxon signed- 
rank test; z = -2.98, P = 0.003, n = 18]. The rep- 
licated effect sizes tend to be of the same sign 
as the original ones but not as large. The mean 
relative effect size of the replications is 65.9%. 

The original and replication studies can also 
be combined in a meta-analytic estimate of the 
effect size (19). As shown in Fig. 1B, in the meta- 
analysis, 14 studies (77.8%) have a significant 
effect in the same direction as in the original 
study. These results should be interpreted cautiously, 
because the estimates assume that the 
results of the original studies do not have pub- 
lication or reporting biases. 

To measure peer beliefs about the replicability 
of original results, we set up prediction markets 
before the 18 replications were performed (25). 
Dreber et al. (26), in a recent study that presented 
evidence for a subset of the replications in the 
RPP, proposed the use of prediction markets as 
an additional replicability indicator. In the pre- 
diction market for a particular target study, peers 
who were likely to be familiar with experimental 
methods in economics could buy or sell shares 
whose monetary value depended on whether the 
target study was replicated (fig. S4 and tables 
S1 and S2). The prediction markets produce a 
collective market probability of replication (27) 
that can be interpreted as a replicability indi- 
cator (26). The traders’ (n = 97) survey beliefs 
about replicability were also collected before 
market trading as an additional measure of peer 
beliefs. 

The average prediction market belief is a rep- 
lication rate of 75.2%, and the average survey be- 
lief is 71.1% (Fig. 2, fig. S5, and tables S3 and S4). 
Both are higher than the observed replication rate 
of 61.1%, but neither difference is significant 
(supplementary materials, section 5). The prediction 
market beliefs and the survey beliefs are highly 
correlated, and both are positively correlated with 
the ranked degree of replication success, although 
the correlation does not reach significance for the 
prediction market beliefs (Fig. 2 and fig. S6). 
Contrary to Dreber et al. (26), prediction market 
beliefs are not a more accurate indicator of 
replicability than survey beliefs. 

We also tested whether replicability is cor- 
related with two observable characteristics of 
effect sizes, combining the original and replication studies. Plotted are 95% CIs of combined effect sizes (standardized to correlation coefficients). The 
standardized effect sizes are normalized as in (A) (fig. S1 shows a nonnormalized version). Fourteen studies have a significant effect in the same direction as 
the original study in the meta-analysis [77.8%; 95% CI = (56.5%, 99.1%)]. 
published studies: the P value and the sample 
size (number of participants) of the original study. 
These two characteristics are likely to be corre- 
lated with each other, which is the case for our 
18 studies (Spearman correlation coefficient = 
-0.61, P = 0.007, n = 18). We expected the rep- 
licability to be negatively correlated with the 
original P value and positively correlated with 
the sample size, because the risk of false positives 
increases with the original P value and decreases 
with the original sample size (statistical power) 
(6, 11). The correlations are presented in Fig. 3 
and table S5, and the results are in line with our 
expectations. The correlations are typically around 
0.5 in the expected direction and significant. Only 
one study out of eight with a P value <0.01 in the 
original study was not replicable at the 5% level 
in the original direction. 


Fig. 2. Prediction market 
and survey beliefs. A plot of 
prediction market beliefs 
and survey beliefs, in relation 
to whether the original result 
was replicated with P < 0.05 
in the original direction. The 
mean prediction market belief 
in a successful replication is 
75.2% [range, 59% to 94%; 
95% CI = (69.7%, 80.6%)], 
and the mean survey belief is 
71.1% [range, 54% to 86%; 
95% CI = (66.4%, 75.8%)]. 
We report the first systematic replications of 
laboratory experiments in economics, with the 
aim of contributing much-needed data to the 
larger question of the replicability of empirical 
findings in all areas of science. The results pro- 
vide provisional answers to two questions: (i) 
Are laboratory experiments in economics gen- 
erally replicable, and (ii) do statistical mea- 
sures of research quality, including peer beliefs 
about replicability, help predict which studies 
will be replicable? 

The provisional answer to the first question 
is that, based on this sample of experiments, rep- 
lication is generally possible, although there is 
room for improvement. Eleven out of 18 (61.1%) 
studies were replicable with P < 0.05 in the orig- 
inal direction, and three more studies were 
relatively close to being replicated (all have sig- 
The prediction market beliefs 
and survey beliefs are highly 
correlated (Spearman correlation 
coefficient = 0.79, P < 0.001, n = 18). Both the prediction market beliefs (Spearman correlation coefficient = 0.30, P = 0.232, 
n = 18) and the survey beliefs (Spearman correlation coefficient = 0.52, P = 0.028, n = 18) are positively 
correlated with the ranked degree of replication success. 
Fig. 3. Correlations between P values and sample sizes in original studies and replicability 
indicators. (A) The original P value is negatively correlated with all six replicability indicators, and five 
of these correlations are significant. (B) The original sample size is positively correlated with all six 
replicability indicators, and five of these correlations are significant. Spearman correlation coefficients 
are shown on the vertical axes. *P < 0.05; **P < 0.01. 
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nificant effects in the meta-analysis). Four rep- 
lications (22.2%) had effect sizes close to zero, 
somewhat more than the 1.4 replication failures 
expected by pure chance (given the mean power 
of 92%). Moreover, the original effect sizes in 
the studies that we replicated could have been 
inflated, a phenomenon that could stem from 
publication bias (28). If there is publication 
bias, our prospective power analyses will have 
overestimated the replication power. 
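The chance benchmark quoted above can be checked with a few lines of arithmetic. The sketch below assumes that every original effect is real and of the originally reported size, so that each replication fails only with probability one minus its power; the variable and function names are illustrative and do not come from the study.

```python
# Figures quoted in the text
n_replications = 18
mean_power = 0.92      # mean prospective power of the replications
replicated = 11        # significant at P < 0.05 in the original direction
near_zero = 4          # replications with effect sizes close to zero


def expected_failures(n, power):
    """Expected number of failed replications if all original effects are real.

    The sum of (1 - power_i) over n studies equals n * (1 - mean power).
    """
    return n * (1.0 - power)


replication_rate = replicated / n_replications
near_zero_share = near_zero / n_replications
chance_failures = expected_failures(n_replications, mean_power)

print(f"replication rate: {replication_rate:.1%}")            # 61.1%
print(f"near-zero share: {near_zero_share:.1%}")              # 22.2%
print(f"failures expected by chance: {chance_failures:.2f}")  # 1.44
```

If publication bias inflated the original effect sizes, the true power would be lower than 92% and the number of failures expected by chance would be correspondingly higher.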

The answer to the second question is that peer 
surveys and market beliefs did contain some in- 
formation about which experiments were more 
likely to replicate, but sample sizes and P values 
in the original studies were even more strongly 
correlated with replicability (Fig. 3). 
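A rank correlation of this kind can be computed as a Pearson correlation on average ranks. The sketch below is a plain-Python illustration of a Spearman coefficient with tie handling; the six example values are hypothetical and are not data from the study, and the function names are ours.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical illustration only, NOT data from the study:
# original P values and a 0/1 indicator of replication at P < 0.05.
original_p = [0.001, 0.004, 0.010, 0.020, 0.030, 0.045]
replicated = [1, 1, 1, 0, 1, 0]
rho = spearman(original_p, replicated)
print(f"Spearman rho = {rho:.2f}")
```

With these made-up values the coefficient is negative, which is the direction reported in the text for the original P value.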

To learn from successes and failures in dif- 
ferent scientific fields, it is useful to compare our 
results with recent results from studies of robust- 
ness in experimental psychology and empirical 
economics. Our results can be compared with the 
recent RPP project in the psychological sciences 
(19), which was also accompanied by prediction 
market beliefs and survey beliefs (26). All mea- 
sures of replication success are somewhat higher 
for the economics experiments than for the sam- 
pled psychology experiments (Fig. 4). Peer beliefs 
in our study are also significantly higher than in 
the RPP study (Fig. 4). Acknowledging the limits 
of this two-study comparison, and particularly our 
small sample of 18 replications, there appears to 
be some difference in replication success between 
these fields. However, it is premature to draw 
strong conclusions about disciplinary differences; 
other methodological factors potentially could 
explain why the replication rates differed. For 
example, in the RPP replications, interaction ef- 
fects were less likely to be replicable than main or 
simple effects (19). 

In economics, several studies have shown 
that statistical findings from nonexperimental 
data are not always easy to replicate (29). Two 
studies of macroeconomic findings, reported 
in the Journal of Money, Credit and Banking in 
1986 and 2006, respectively, found that only 13% 
and 23% of original results were replicable, even 
when the data and code were easily accessible 
(30, 31). An analysis of 50,000 P values reported 
between 2005 and 2011 in three widely cited 
general economics journals found that P values 
between 0.10 and 0.25 were less common than 
might be expected (32). However, the frequency 
of these “missing” P values is smaller in lab- 
oratory and field experiments. Taken together, 
these analyses and our replication sample suggest
that laboratory experiments are at least as robust as,
and perhaps more robust than, other
kinds of empirical economics. 

Two methodological research practices in lab- 
oratory experimental economics may contribute 
to relatively high replication success. First, exper- 
imental economists have strong norms about 
motivating subjects with substantial financial 
incentives and avoiding the use of deception. 
These norms make subjects more responsive 
and may reduce variability in how experiments 
are performed across different research teams, 
Fig. 4. A comparison of replicability indicators in experimental economics (this study) and psychological
sciences (RPP). The graph shows means ± SE for replicability indicators. All six replicability 
indicators are higher for experimental economics; this difference is significant for three of the replicability 
indicators. The average difference in replicability across the six indicators is 19 percentage points. Details 
about the statistical tests are included in the supplementary materials. *P < 0.05; **P < 0.01. 


thereby improving replicability. Second, pioneer- 
ing experimental economists were eager for others 
to adopt their methods; to this end, they per- 
suaded journals to print instructions and even 
original data. These editorial practices created 
norms of transparency and have made replication 
and reanalysis relatively easy. 

There is every reason to be optimistic that 
science in general, and social science in partic- 
ular, will emerge much improved after the cur- 
rent period of critical self-reflection. Our study 
suggests that laboratory experiments published 
in top economic journals have relatively high 
rates of replicability. Challenges still remain: For 
example, executing replications can be laborious, 
even when scientific journals require online post- 
ing of data and computer code to make things 
easier. This is a reminder that as scientists, we 
should design and document our methods to 
anticipate replication and make it easy to do. 
Our results also show that there is some infor- 
mation in post-publication peer beliefs (revealed 
in both markets and surveys), and perhaps even 
more information in simple statistics from pub- 
lished results, about whether studies are likely 
to be replicable. All of these developments sug- 
gest that the cultivation of good professional 
norms, discouragement of bad norms, policing 
of disclosure requirements by journals, and sim- 
ple evidence-based editorial policies can improve 
scientific replicability, perhaps very quickly. 
PHYSIOLOGICAL ECOLOGY 


Seasonal and daily climate variation 
have opposite effects on species 
elevational range size 


Wei-Ping Chan, I-Ching Chen, Robert K. Colwell, Wei-Chung Liu,
Cho-ying Huang, Sheng-Feng Shen


The climatic variability hypothesis posits that the magnitude of climatic variability 
increases with latitude, elevation, or both, and that greater variability selects for 
organisms with broader temperature tolerances, enabling them to be geographically 
widespread. We tested this classical hypothesis for the elevational range sizes of 
more than 16,500 terrestrial vertebrates on 180 montane gradients. In support
of the hypothesis, mean elevational range size was positively correlated with
the scope of seasonal temperature variation, whereas elevational range size
was negatively correlated with daily temperature variation among gradients.
In accordance with a previous life history model and our extended versions of it,
our findings indicate that physiological specialization may be favored under
shorter-term climatic variability.


Changes in patterns of climatic variability
with global warming are progressively more
conspicuous (1). Increasing seasonal variability
and asymmetric changes of daily maximum
and minimum temperatures have 
altered the thermal environment that organisms 
experience (2-4). So far, little is known about 
how species respond physiologically to climate 
variation (5, 6), yet these responses are crucial for 
survival in an era of rapid climate change. The 
climatic variability hypothesis suggests that or- 
ganisms experiencing higher thermal variability, 
and thus having broader physiological thermal 
tolerances, tend to be geographically widely dis- 
tributed as a consequence (7). This hypothesis 
is regarded as a broad macrophysiological prin- 
ciple, as it brings together climate patterns and 
mechanisms of adaptation to explain macro- 
ecological phenomena (8, 9). Although species 
face environmental fluctuations on the scale of 
hours to days to years to decades and beyond, 
how the interplay between climatic variability 
at these various temporal scales contributes to 
shaping the evolution of species’ physiological 
traits and geographical range sizes has rarely 
been addressed. 

Consideration of how species range size relates 
to climatic variation has deep roots (10). Janzen 
(11) explained that “mountain passes are higher 
in the tropics” because species inhabiting tropical
mountains experience relatively lower seasonal 
variation in temperature than species at com- 
parable elevations at higher latitudes and may 
therefore evolve narrower physiological toler- 
ances. Temperature gradients in tropical moun- 
tains thus become effective dispersal barriers 
and result in relatively smaller elevational range 
sizes (11, 12). Stevens went on to propose Rapo- 
port’s rule, which postulates a positive correla- 
tion between species range size and latitude or 
elevation, suggesting that climatic variability may 
be the underlying mechanism (13, 14). Empir- 
ical support for these components of the cli- 
mate variability hypothesis has been equivocal 
(15-17), partly due to the use of latitude or elevation 
as a rough proxy for climatic variability (18-21). 
Previous studies often neglected considerable 
variation in climate components within latitudes 
(22), as well as associated distinct biological 
influences. 

Here we assess how climatic variability on con- 
trasting temporal scales—seasonal and diurnal— 
influences the elevational range size of terrestrial 
vertebrates across the world. We obtained data 
for climatic variables potentially associated with 
species range size from CRU TS2.1 and other open 
sources (23) (table S1) and adopted McCain’s 
carefully vetted database of elevational range size 
for 16,592 species of rodents, bats, birds, lizards, 
snakes, salamanders, and frogs on 180 montane 
gradients spanning from 36.5°S to 48.2°N latitude 
(19) (fig. S1). We calculated mean elevational range 
size for each taxonomic group on each gradient. 
These means, not individual ranges, formed the 
basis for all analyses and are henceforth referred 
to simply as “elevational range size.” 
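The aggregation step described above, in which individual species ranges are averaged within each taxonomic group on each gradient, can be sketched as follows. The record layout (gradient, taxon, lower limit, upper limit, in meters) and the example values are hypothetical and serve only to illustrate the grouping; they are not taken from the database used in the study.

```python
from collections import defaultdict


def mean_range_by_group(records):
    """Average elevational range size per (gradient, taxon) pair.

    Each record is (gradient, taxon, lower_limit_m, upper_limit_m).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (gradient, taxon) -> [total range, count]
    for gradient, taxon, low_m, high_m in records:
        acc = sums[(gradient, taxon)]
        acc[0] += high_m - low_m
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical species records, for illustration only
example_records = [
    ("G1", "birds", 500, 1500),
    ("G1", "birds", 1000, 1400),
    ("G1", "bats", 200, 900),
    ("G2", "birds", 0, 2000),
]
mean_ranges = mean_range_by_group(example_records)
print(mean_ranges[("G1", "birds")])  # 700.0
```

Working with one mean per group and gradient means that species-rich gradients do not dominate the analysis simply by contributing more rows.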

We first applied hierarchical partitioning (24) 
to select the environmental and geographic var- 
iables with the highest explanatory power for 
elevational range size. The nine variables retained 
were daily temperature maximum, diurnal temperature
range (DTR), mean annual temperature, 
seasonal temperature range (STR), minimum and 
maximum monthly mean temperature, mean an- 
nual precipitation (MAP), latitude, and mountain 
height (fig. S2). We then applied structural equa- 
tion modeling (SEM) (25) to assess the relation- 
ships among these variables in explaining range 
size. SEM is capable of including non-mutually 
exclusive hypotheses in a system of relationships 
(25) and, hence, is particularly suitable to struc- 
ture the multiple pathways of highly correlated 
climatic variables that shape elevational range 
size (23) (fig. S3). 

On the basis of the preliminary hierarchical 
partitioning and subsequent SEM analysis, we 
found that latitude alone explained little of the 
variation in elevational range size (Fig. 1, A and 
B), in accord with other studies that used latitude 
as a proxy for climatic variability (15, 17, 19). How- 
ever, when we considered all possible combina- 
tions of proxies, drivers, and relevant climate 
components, the final model retained latitude, 
MAP, STR, and DTR as the best model (Fig. 1, A 
and C). In this model, STR had a significantly 
positive relationship with elevational range size 
for our vertebrate data set (correlation coefficient 
R = 0.29, probability P = 0.006) (Fig. 1A and table 
S3). Not surprisingly, latitude had a strong and 
significant positive relationship with STR (R = 
0.88, P < 0.001) (Fig. 1A and fig. S4A) and thus 
indirectly influenced elevational range size through 
its effect on STR in the model. Together, these 
results support the climate variability hypothesis 
and corroborate previous results (11, 12, 19). 

However, elevational range size had a signifi- 
cantly negative relationship with DTR (R = -0.25, 
P = 0.012) (Fig. 1, A and D). Moreover, DTR and 
STR are each negatively correlated with MAP 
(fig. S4, D and E; R = -0.54, P < 0.001 and R = -0.07, 
P = 0.025, respectively; panels A, B, and C in fig. 
S4 display the global patterns between each cli- 
matic factor and latitude). In contrast, MAP showed 
only a weak correlation with elevational range size 
itself (Fig. 1E), as demonstrated previously by 
McCain (19). Our final model fits better than a 
model with only latitude and STR [root mean 
square error of approximation = 0.076; compar- 
ative fit index = 0.981; standard root mean square 
residual = 0.073 (table S2); note that SEM pen- 
alizes for each additional parameter]. When we 
used climate variables for which climate data are 
currently available at finer spatial resolutions 
(5 arc min and 30 arc sec) (fig. S5), the structured 
relationships remained robust, except that the 
effect of STR became insignificant in one model 
variant. 

In our analysis, latitude and MAP emerged as 
the geographical and environmental factors that 
indirectly shape elevational range size through 
their influence on climatic variability (DTR and 
STR). We used a stationary bootstrap method to 
assess whether STR and/or DTR is more explanatory 
than expected at random along latitude and MAP 
gradients (23). We found that MAP gradients, but 
not latitude, influenced the relative importance of 
DTR versus STR with regard to the elevational 
range size (Fig. 2, A and B). The explanatory 
power of MAP was generally higher than random
expectation along the precipitation gradient, 
whereas latitude showed considerably less devi- 
ation from random expectations. Because precip- 
itation influences global energy flow through its 
correlation with cloudiness and latent heat flux, 
MAP has been identified as a dominant factor 
governing Earth’s thermodynamics (26). At low- 
er precipitation levels, DTR was the dominant 
influence on geographic variation in mean eleva- 
tional range size, whereas STR dominated at 
moderate precipitation levels. At high precipi- 
tation levels, the effects of both DTR and STR 
were diminished (Fig. 2A). This complex relation- 
ship was generally concealed when the proxy 
approach was directly applied. As shown by the 
blue lines in Fig. 2, A and B, the locally weighted 
scatterplot smoothing (LOESS) lines for eleva- 
tional range size did not respond noticeably to 
either gradient. 
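The text names a stationary bootstrap but gives the details only in the supplementary materials (23). As a rough illustration of the general technique, and not of the authors' implementation, the sketch below draws resampling indices in blocks of random, geometrically distributed length that wrap around the end of an ordered series (for instance, gradients ordered by MAP or latitude). The function name and the `mean_block` parameter are our own assumptions.

```python
import random


def stationary_bootstrap_indices(n, mean_block, rng):
    """Draw n resampling indices using blocks of geometric random length.

    At each step a new block starts at a uniformly random position with
    probability 1 / mean_block; otherwise the current block continues to
    the next position, wrapping around the end of the series.
    """
    p = 1.0 / mean_block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))   # start a new block
        else:
            idx.append((idx[-1] + 1) % n)  # continue the current block
    return idx


rng = random.Random(42)
series = [0.1, 0.4, 0.3, 0.8, 0.5, 0.9, 0.7, 0.2]  # hypothetical values
resample = [series[i] for i in stationary_bootstrap_indices(len(series), 3, rng)]
print(resample)
```

Repeating the draw many times and recomputing the statistic of interest on each resample gives a reference distribution that preserves local ordering along the gradient, against which the observed explanatory power can be compared.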

Our structural equation model demonstrated 
that STR and DTR have opposite effects on spe- 
cies elevational range size. Although organisms 
must evolve to survive all conditions that they 
experience (tolerance range), they can nonetheless 
focus reproductive activity on a narrow range of 
conditions (optimum performance range), as long 
as they experience those conditions often enough 
within their life span (27). Using a phenotypic 
optimality model, Gilchrist (27) demonstrated that 
greater among-generation temperature variation 
should favor a wider performance range (thermal 
generalists), whereas a narrower performance 
range (thermal specialists) will be favored by se- 
lection when within-generation temperature var- 
iation is great. Recent empirical studies also show 
that the scope of tolerance range limits for motor 
function and survival, as determined experimen- 
tally, may be a poor predictor of elevational range 
size for thermal generalists (28). 

Nevertheless, because Gilchrist’s model focused 
on within- and among-generation environmental 
variation, an organism’s life span should have a 
pronounced influence on the evolution of thermal 
performance range. Thus, it is perhaps surprising 
to see the strong relationships among STR, DTR, 
and range size for the vertebrate species in our 
analysis, given that most have multiyear genera- 
tion times. We therefore extended Gilchrist’s ap- 
proach to general forms of environmental variation 
to investigate the expected effects of longer- and 
shorter-term environmental variations on the ex- 
pected evolution of performance range (23) (figs. 
S6 to S9). We found that Gilchrist’s principal 
predictions still hold, even when we replaced 
among- and within-generation variations with 
a more general form of longer- and shorter-term 
variation, respectively. This result arises simply 
because longer-term variation (including STR) 
occurs more frequently among generations than 
within generations, whereas shorter-term varia- 
tion (e.g., DTR) tends to occur within genera- 
tions (23). Moreover, we found that average STR 
was highly correlated with multiyear temper- 
ature variation (R = 0.87, P < 0.001) (23) (fig. S10). 
Together, these results help to explain the im- 
portant roles of STR and DTR in shaping the 
Fig. 1. Relationships among MAP, DTR, latitude, and STR in explaining the elevational range sizes 
of terrestrial vertebrates. (A) Structural equation statistical model. N, number of mountain gradients; 
RMSEA, root mean square error of approximation; SRMR, standard root mean square residual; CFI, com- 
parative fit index. (B) Direct relationship between elevational range size and latitude. The blue line rep- 
resents the LOESS mean; the red dashed line represents a significant linear relationship. (C) Conceptual 
scheme of this study. Plus and minus symbols represent positive and negative relationships, respectively. 
(D) Partial residual plots of elevational range size and DTR. The red line represents the regression curve, 
which controls for the effect of STR and the interaction between DTR and STR. The gray shaded area 
represents the smoothed 95% confidence interval. (E) Direct relationship between elevational range size 
and MAP. The blue line represents the LOESS mean. In (A), the structural equation model, numbers 
next to arrows and boxes are unstandardized slopes and intercepts, respectively. The double-headed 
arrow indicates correlations between factors. For this analysis, taxonomic differences were statistically 
controlled by setting taxon as a variable, but taxa were also analyzed separately (fig. S11). For details, 
see tables S3 to S5. 


elevational range sizes of the vertebrate species 
in this study. 

In addition, taxon-specific analysis showed 
that MAP and DTR synergistically shape ele- 
vational range sizes of rodents and birds (but 
not bats, the third endotherm group consid- 
ered), with increasing range size associated with 
greater MAP (fig. S11). For endotherms, water 
availability is crucial for evaporative cooling in 


a hot environment (29). The role of water in 
adaptation to cold remains largely unexplored 
in ecological studies, but water may be impor- 
tant in blood circulation and metabolic heat 
(30). Further studies of the relationship between 
water availability and shorter-term temperature 
variation could prove fruitful, especially for en- 
dotherms, including bats (see supplementary text 
and fig. S12). 
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Fig. 2. Influence of DTR and STR along environmental gradients. Panels show the relative explanatory 
power of DTR and STR for elevational range size along (A) the MAP gradient and (B) the latitudinal 
gradient. In the upper panels, blue lines represent LOESS lines of the plots in Fig. 1, B and E. Total ex- 
Fig. 3. Global maps of temperature variability. (A) Seasonal temperature range (STR). (B) Diurnal 
temperature range (DTR). (C) Mean annual precipitation (MAP). (D) RGB (red-green-blue) color 
spectra presenting STR, DTR, and MAP. For example, the northern Amazon basin within the tropical 
region has very high MAP with low STR and DTR, yielding bluish pixels in (D). All maps are at 0.5° spatial 
resolution. 


On the basis of our empirical and modeling 
results, we propose a new macroecological prin- 
ciple. Introducing temporal scale offers a new 
perspective on the physical influence of climatic 
variability. STR dominates the thermal profile at 
high latitudes in the Northern Hemisphere, where- 
as tropical areas with high amounts of rainfall 
weaken the contrast between DTR and STR (Fig. 3). 
DTR dominates the majority of the rest of the 
land surface, including arid land masses, moun- 
tainous areas, and most of the terrestrial South- 
ern Hemisphere (Fig. 3D). We conclude that the 
relevance of each climatic factor to the range 
size of species should be carefully evaluated for 
organisms of different taxonomic groups, char- 
acterized by different generation times and ther- 
moregulatory systems. 
Our study may have implications for under- 
standing biological responses to climate change. 
For example, tropical species are expected to be 
thermal specialists because they are adapted to 
low STR (5, 6). Nevertheless, because of their 
adaptation to higher DTR, both tropical and temperate
montane species (of some groups) may be
thermal specialists and, thus, vulnerable to
changing climates. 
MEMORY FORMATION 


Diversity in neural firing dynamics 
supports both rigid and learned 
hippocampal sequences 


Andres D. Grosmark and György Buzsáki


Cell assembly sequences during learning are “replayed” during hippocampal ripples and 
contribute to the consolidation of episodic memories. However, neuronal sequences 
may also reflect preexisting dynamics. We report that sequences of place-cell firing in a 
novel environment are formed from a combination of the contributions of a rigid, 
predominantly fast-firing subset of pyramidal neurons with low spatial specificity and 
limited change across sleep-experience-sleep and a slow-firing plastic subset. Slow-firing 
cells, rather than fast-firing cells, gained high place specificity during exploration, elevated 
their association with ripples, and showed increased bursting and temporal coactivation 
during postexperience sleep. Thus, slow- and fast-firing neurons, although forming a 
continuous distribution, have different coding and plastic properties. 


The restructuring of hippocampal networks
through synaptic plasticity is necessary for
the formation of new episodic memories.
Replay of hippocampal place-cell (1) sequences
during sharp wave ripples (SPW-Rs)
of waking immobility (2-5) and non-rapid eye 
movement sleep (6-13) after learning has been 
proposed to support memory consolidation (10-13). 
Replay is conceptualized and typically studied as 
a phenomenon with higher-order interactions 
within populations of neurons taken to have sim- 
ilar properties (10, 14). However, networks built 
from similar neurons are unstable (15), and recent 
findings demonstrate that biophysical properties 
of cortical pyramidal neurons are highly diverse 
and characterized by lognormal distributions of 
synaptic weights, long-term firing rates, and spike 
bursts (16). Furthermore, temporal correlations of 
hippocampal neurons are largely preserved across 
brain states and environmental situations, sug- 
gesting that learning-induced changes are con- 
strained within a dynamically stable network 
(16, 17). An example of a preexisting bias between 
place-cell sequences in a novel environment and 
sleep before the novel experience (preplay) has 
been described (18-20), although its computation- 
al relevance has been questioned recently (14). To 
clarify the relationship between preexisting bio- 
physical properties of neurons and their contri- 
bution to learning, characterization of individual 
neurons is necessary. We performed such analy- 
ses during sleep in rats before and after they ex- 
plored a novel environment. 

Simultaneous recordings of well-isolated CA1 
pyramidal single units were performed in four 
rats. Several methods were used to assess the 
relationship between firing patterns during exploration
of one of two linear or a circular track 
(MAZE) in rooms B, C and D, respectively, and 
candidate SPW-R sequences during preexperience 
sleep (PRE) and postexperience sleep (POST) in 
the home cage in room A (fig. S1) (22). First, a 
spatial Bayesian decoder (2), constructed from 
the firing-rate vectors of place cells (n = 491 cells) 
during track running (eight novel exploration ses- 
sions), was applied to all candidate ripple events 
(20) (figs. S2 to S4) of PRE, MAZE, and POST im- 
mobility epochs to estimate the posterior proba- 
bilities of position in forward (2, 5, 9) or reverse 
virtual traversals of the track (3, 4) (Fig. 1, A and B). 
These virtual traversals were measured as weighted 
correlations over the Bayesian derived posteriors 
for place across all 20-ms bins in each ripple event 
(21) and normalized as Z scores [rZ (Sequence 
score) (27)] (figs. S5 to S8). 
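To make the decoding step concrete, the sketch below shows a generic memoryless Bayesian position decoder of the kind commonly used for place-cell data: independent Poisson spiking, a uniform prior over positions, and a 20-ms time bin as stated in the text. The text specifies only that the decoder was built from firing-rate vectors, so the Poisson and uniform-prior assumptions, the function name, and the toy rate maps are ours and may differ from the authors' implementation.

```python
import math


def decode_position(rate_maps, spike_counts, bin_s=0.020):
    """Posterior probability over position bins for one time bin.

    rate_maps[i][x] is the firing rate (Hz) of cell i at position x;
    spike_counts[i] is the number of spikes of cell i in the time bin.
    Assumes independent Poisson firing and a uniform prior over positions.
    """
    n_pos = len(rate_maps[0])
    log_post = []
    for x in range(n_pos):
        ll = 0.0
        for rates, k in zip(rate_maps, spike_counts):
            lam = max(rates[x], 1e-9) * bin_s  # expected spikes; floor avoids log(0)
            ll += k * math.log(lam) - lam      # Poisson log likelihood without log(k!)
        log_post.append(ll)
    peak = max(log_post)                       # subtract the peak for numerical stability
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: three cells, each tuned to one of three positions (hypothetical rates)
toy_maps = [[10.0, 1.0, 1.0], [1.0, 10.0, 1.0], [1.0, 1.0, 10.0]]
posterior = decode_position(toy_maps, [0, 2, 0])
print(posterior.index(max(posterior)))  # 1
```

Applying the decoder to successive 20-ms bins of a candidate ripple event yields a time-by-position posterior matrix, over which a weighted correlation between time and decoded position can then be computed.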

To determine each place cell’s contribution to 
PRE and POST sequences [per cell contribution 
(PCC)], a PCC score was defined as the neuron’s 
mean contribution across all PRE and POST pu- 
tative events as determined by a cell-specific 
shuffling technique (27). Neurons that showed 
significant PCC scores in either PRE or POST 
were considered to be strongly contributing to sequence
formation (n = 216 neurons) (Fig. 2B) (27). 
Importantly, the majority of neurons strongly con- 
tributing to PRE (73%) also contributed to POST 
sequences (Fig. 2C and fig. S19), suggesting that 
these neurons represent rigid network elements. 
Strongly contributing neurons below the 50th percentile of 
that session’s PRE to POST sleep PCC change 
(ΔPCC) were classified as rigid cells (blue x’s in 
Fig. 2B), whereas those above the 50th percentile were classified as plastic 
cells (red x’s in Fig. 2A and figs. S9 to S11). Rigid 
and plastic neurons were similarly distributed 
along the track (fig. S2). The contribution of individual
neurons to the overall population ΔPOST-PRE
score (POST rZ - PRE rZ) was assessed by 
either shuffling or excluding neurons with increasing
or decreasing ΔPCC scores from the analysis 
(Fig. 2C and fig. S12) (27). Replay (POST > PRE) 


was eliminated after shuffling or removal of the 
top 10 to 20% of cells with the highest ΔPCC scores, 
whereas it remained after shuffling or removal of 
the bottom 75% (Fig. 2, C and D, and fig. S12). 
These results suggest that from the PRE to the 
POST sleep, plastic neurons are added to a preexisting
backbone structure, leading to an increase 
in maze-related sequential activity (replay) associated
with learning (Fig. 2C). 
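The rigid versus plastic labeling described above amounts to a within-session median split on the PRE to POST change in PCC. A minimal sketch follows, assuming the change scores for one session's strongly contributing neurons are already available; the cell identifiers and values are hypothetical, and assigning a score exactly at the median to the plastic group is our own tie-breaking choice, which the text does not specify.

```python
import statistics


def classify_cells(delta_pcc):
    """Label each cell 'rigid' or 'plastic' by a median split on its PCC change.

    delta_pcc maps a cell identifier to its POST minus PRE PCC score for
    one session. Cells below the session median are rigid; the rest are plastic.
    """
    cutoff = statistics.median(delta_pcc.values())
    return {
        cell: ("rigid" if change < cutoff else "plastic")
        for cell, change in delta_pcc.items()
    }


# Hypothetical scores for four strongly contributing cells in one session
labels = classify_cells({"a": -0.2, "b": 0.1, "c": 0.5, "d": -0.4})
print(labels)  # {'a': 'rigid', 'b': 'plastic', 'c': 'plastic', 'd': 'rigid'}
```

Because the cutoff is computed per session, the two groups are balanced within each recording, which keeps session-level differences from driving the comparison.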

Although both rigid and plastic neurons con- 
tributed to replay sequences, the nature of their 
representation was different. Plastic neurons had 
higher place-specific indices and fewer place fields 
compared with rigid cells (Fig. 3A), and plastic but 
not rigid neurons increased their spatial specific- 
ity steadily during learning (Fig. 3B and fig. S13). 
Neither overall firing-rate changes (figs. S14 and 
S15) nor potential errors in neuronal clustering or 
neuron classification (figs. S2 and S11) could ac- 
count for the above differences. These findings 
suggest that precision in spatial coding is a prop- 
erty developed during maze running by a small 
plastic subset of cells. 

Next, we asked whether rigid and plastic neu- 
rons have different biophysical and network prop- 
erties. Session-wide firing rates of rigid neurons 
were significantly higher compared with plastic 
neurons (Fig. 4, A and B, and figs. S14 and S15). 
Ripple-related spiking, bursting, and pair-wise 
coactivation were higher in plastic versus rigid 
neurons (Fig. 4A and figs. S14 to S16). In PRE 
sleep, coactivity was dominated by fast-firing neu- 
rons, and these correlations remained unchanged 
into POST sleep. In contrast, slow-firing neurons 
showed the strongest increases in coactivation 
from PRE to POST sleep (Fig. 4B). Pair-wise co- 
activity and temporal bias patterns were stable 
from PRE to POST for pairs of rigid cells, whereas 
plastic cell pair interactions were modified by experience
on the novel maze (10) (figs. S16 and S17). 

Because the above analyses indicated that over- 
all firing rates and ripple-related activity of neu- 
rons predicted their coding and plastic features, in 
our final analysis we divided place cells into equal 
subgroups by either “off-line” sleep-firing rate or 
ripple-rate gain and repeated the Bayesian place 
decoding analysis for each group. Low-rate neu- 
rons (median, 0.39 Hz) contributed more spatial 
information per spike on the maze and displayed 
increased within-ripple firing-rate gains from PRE 
to POST sleep than high-rate cells (median, 1.12 Hz) 
(fig. S16). Slow-firing, but not fast-firing, neurons 
increased their contribution to neuronal sequences 
from PRE to POST sleep (Fig. 4D). Conversely, PRE 
to POST increases in sequence content were limited
to cells that showed a high degree of ripple-specific
recruitment (gain) (Fig. 4E and figs. S20 
to S22), suggesting that ripples are privileged 
windows for learning-related changes in excitability. 
Similar results were obtained using several Bayesian 
and non-Bayesian replay methods (figs. S19 to S22). 

Using several established and newly developed 
methods, we demonstrate that sequences of place 
cells in a novel environment are formed from a 
combination of a relatively fast-firing group of pyramidal
neurons with relatively unchanging temporal
dynamics and a slow-firing plastic subset of 
Fig. 1. PRE and POST maze sequence events. (A) Simultaneous recording of 77 place cells (rightward runs) used to generate a sequence template. (B) Representative
forward and reverse sequences during PRE maze sleep, immobility in the novel MAZE, and POST-learning sleep. (C) Cumulative distribution of rZ for 
PRE, MAZE, and POST events, and 95% confidence intervals. Vertical dashed lines, medians. Inset, mean ± SE of sequence scores in each condition. *P < 0.05; 
**P < 0.005; ***P < 0.0005 (Kruskal-Wallis test, followed by post hoc Tukey-Kramer tests). Sign-rank tests were used for within-condition significance testing. 
(D) As inset in (C), but events with forward and reverse sequences are shown separately (within epoch comparisons, ranked-sum test). 
POST sequence content. Gray line, least-squares regression between all PRE and POST PCC scores. Neurons strongly contributing to either PRE or POST (21) are 
marked with X; others are marked with dots. Strongly contributing neurons in the lower 50th percentile of that session’s PRE versus POST change (ΔPCC) were 
considered rigid cells and those in the upper 50th percentile as plastic cells. (C) Raster and local field potential (LFP) plots of example ripple events from the PRE, 
MAZE, and POST epochs (diamonds show within-ripple spike-time center of mass). These six events correspond to the top row of Fig. 1B. Rigid and plastic cell 
spikes are shown in blue and red, respectively. Although rigid cells tend to predominate in the PRE epoch, the marked increase in sequence content observed in the 
MAZE and POST epochs is driven by the recruitment of plastic cells. (D) To assess the contribution of neurons with differing ΔPCC scores to the change in virtual 
travel content from the PRE to the POST epoch, the replay analysis was repeated using templates in which an increasing percentage (x axis) of neurons’ place 
fields were shuffled either beginning with those that showed the lowest ΔPCC values (blue line) (shaded area shows bootstrapped 95% confidence interval) or 
beginning with neurons with the highest ΔPCC scores (red line). (E) Effect on sequence content of removal of rigid (black) or plastic (gray) neurons. The PRE to 
POST increase in sequence content is attributable to only a small number of plastic cells. 
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related plasticity. (A) Summary of excitability and synchrony profiles of rigid 
and plastic cells. Each panel shows the cumulative distributions of the two 
groups; dashed lines show medians (P values, rank-sum tests). (B) To ex- 
amine the relationship between firing rates and learning-related changes in 
pair-wise coactivation (Pearson's correlation of firing rates in 100-ms bins), 
neurons in PRE, MAZE, and POST were sorted by their overall session (SESS) 
firing rates [(B), left panel]. (©) Coactivation was assessed across overlapping 
groups each containing 20% of place cells with similar firing rates (step size, 
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1% of cells). Although fast-firing cells dominated the coactivation structure 
during the PRE (left panel), it was the slow-firing cells that showed the highest 
increase in coactivation from PRE to POST (middle two panels). Moreover, 
it was the slow-firing cells that showed the greatest replay (partial cor- 
relation across coactivation values between RUN and POST, accounting for 
PRE). (D and E) An additional replay analysis restricted to place cells with 
either low or high firing rates (D) or within-ripple firing-rate gains [(E), 
annotation same as in Fig. 2D] confirms the findings obtained using the PCC 
method. 
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neurons (22). Firing properties of neurons pre- 
dicted their rigid (18, 19) and plastic (0) features. 
Slow-firing neurons gained high place specificity 
during maze exploration (23, 24) and showed in- 
creased ripple-related recruitment during POST- 
experience sleep. In contrast, fast-firing neurons 
had low selectivity (16, 25), have been shown pre- 
viously to project to multiple targets (26) and to 
form an interactive subnetwork responsible for 
global stability, thus allowing plasticity to take 
place in the remaining majority of slow-firing cells 
(16, 17). Fast-firing neurons may generalize across 
situations, whereas slow-firing neurons may differ- 
entiate among them (27). Because replay sequence- 
forming neurons are drawn from the wide span of 
a continuous log-rate distribution (16) with vary- 
ing coding, biophysical, circuit, and plasticity prop- 
erties, these events can forward a synthesis of 
preexisting and new information to downstream 
observer neurons. 
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Sequential transcriptional waves 
direct the differentiation of newborn 
neurons in the mouse neocortex 


Ludovic Telley,* Subashika Govindan,* Julien Prados, Isabelle Stevant,
Serge Nef, Emmanouil Dermitzakis, Alexandre Dayer, Denis Jabaudon†


During corticogenesis, excitatory neurons are born from progenitors located in the 
ventricular zone (VZ), from where they migrate to assemble into circuits. How neuronal 
identity is dynamically specified upon progenitor division is unknown. Here, we study 
this process using a high-temporal-resolution technology allowing fluorescent tagging 
of isochronic cohorts of newborn VZ cells. By combining this in vivo approach with 
single-cell transcriptomics in mice, we identify and functionally characterize neuron- 
specific primordial transcriptional programs as they dynamically unfold. Our results 
reveal early transcriptional waves that instruct the sequence and pace of neuronal 
differentiation events, guiding newborn neurons toward their final fate, and contribute 
to a road map for the reverse engineering of specific classes of cortical neurons from
undifferentiated cells.


During neocortical development, distinct classes of neurons assemble to form local and
long-range circuits. Although class-specific genes and features identify cortical neuron
types relatively late in differentiation (1-5), early postmitotic fate specification
programs have been inaccessible. Here, we describe the dynamic transcriptional activity
controlling layer 4 (L4) excitatory neuron birth and differentiation in the mouse neocortex.

Mammalian cortical progenitor cells in the 
ventricular zone (VZ) undergo DNA synthesis [S- 
phase, susceptible to bromodeoxyuridine (BrdU) 
labeling] at the basal border of the VZ and mito- 
sis (M-phase, lasting about an hour at midcorti- 
cogenesis in mice) when their soma is apically 
located, adjacent to the ventricular space (6, 7). 
At this location, mitotic cells are susceptible to 
labeling by intraventricular injection of carboxy- 
fluorescein esters [“FlashTag” (FT)], which bind 
to and fluorescently label intracellular proteins 
(8). The short extracellular half-life of FT in 
the mouse ventricular space ensures effective 
pulse-labeling of juxtaventricular dividing cells 
(Fig. 1A and fig. S1). Intracellularly, FT is linearly diluted at each mitosis, such that
fluorescence reflects the number of cell divisions that have occurred since the time of
labeling (fig. S1, D and E, and movie S1) (8). FT⁺ newborn cells synchronously moved away
from the ventricular wall within 3 hours of labeling (Fig. 1A, bottom), reached the
subventricular zone (SVZ) within 12 hours, and entered the cortical plate (CP) 24 to
48 hours after mitosis (Fig. 1B). Isochronic cohorts of VZ cells born at the time of
injection can thus be specifically identified and tracked during their initial
differentiation.
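
The dilution principle can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the dye is split evenly between the two daughter cells so that fluorescence halves at each mitosis; the function name and the example intensities are hypothetical and are not taken from the study.

```python
import math


def estimated_divisions(initial_fluorescence: float, observed_fluorescence: float) -> float:
    """Estimate divisions since labeling, ASSUMING fluorescence halves per mitosis."""
    if initial_fluorescence <= 0 or observed_fluorescence <= 0:
        raise ValueError("fluorescence values must be positive")
    if observed_fluorescence > initial_fluorescence:
        raise ValueError("observed fluorescence cannot exceed the initial value")
    # Each even split divides the signal by 2, so n = log2(F0 / F).
    return math.log2(initial_fluorescence / observed_fluorescence)


# Hypothetical intensities in arbitrary units: a fourfold drop implies two divisions.
divisions = estimated_divisions(1000.0, 250.0)
print(divisions)
```

Under this assumption, a cell that retains the full signal has not divided since the pulse, which is the property used to follow a single isochronic cohort.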

The laminar fate of FT⁺ neurons was linked to the day of FT injection at all ages examined
[embryonic day (E) 11.5 to 17.5] (fig. S2 and Fig. 1C). At postnatal day (P) 7, when
neuronal migration is complete, E14.5-labeled FT⁺ neurons were restricted to a sublamina
of L4 (Fig. 1C). These
neurons were born at the time of the FT pulse, 
not later, because they mostly remained un- 
labeled after continuous BrdU administration be- 
ginning at the time of the FT pulse (fig. S1, B to 
D). Injection of FT at E14 and E14.5 using two 
dye colors in the same embryo showed two dis- 
tinct populations of labeled neurons within L4 
at P7, revealing a tight relationship between time 
of birth and final radial location, even within a 
single layer (Fig. 1D). Thus, we used E14.5 FT in- 
jections to label L4 neurons in vivo from the 
time of mitosis in the VZ and track their early 
molecular differentiation. 

We observed that newborn cells sequentially expressed PAX6, a VZ marker; TBR2, an SVZ
marker; and the early neuronal protein TBR1 (9, 10) within the first 48 hours after
mitosis (fig. S3). This reveals a highly dynamic cellular process characterized by
overlapping signature shifts in protein expression. For an unbiased account of the
transcriptional programs active just after cell birth in single cells, we isolated
E14.5-born FT⁺ cells 6, 12, 24, and 48 hours after mitosis by using cortical
microdissection followed by fluorescence-activated cell sorting (FACS). We characterized
transcriptional activity using single-cell RNA sequencing in microfluidically isolated
single cells (fig. S4, A and B) (11, 12).

Fig. 1. FT labels time-locked cohorts of newborn VZ cells during corticogenesis. (A) (Top) Schematic representation of the FT labeling principle. (Bottom)
Pulse-labeling of isochronic mitotic cells using FT at E14.5. PH3, phospho-histone 3, an M-phase marker. (B) Isochronic cohorts of FT⁺ cells radially migrate
from the VZ to the CP. PAX6 and TBR2 delineate the VZ and SVZ. (C) E14.5 FT labeling identifies a subset of L4 neurons at P7. (D) E14 (FTy⁺) and E14.5 (FTg⁺)
VZ-born neurons occupy distinct sublaminae within L4. Cx, cortex; IZ, intermediate zone. See also figs. S1 and S2.

To determine the sequence and pace of early differentiation processes, we first examined
the expression dynamics of a core set of genes involved in proliferation, neurogenesis
(i.e., genes that promote differentiative divisions), and neuronal differentiation. We
found that proliferative (P), neurogenic (Ng), and neuronal (N) transcripts were
sequentially expressed: All P transcripts were repressed first, Ng transcripts showed
delayed repression, and N transcripts were induced after cell division (see fig. S4C and
table S1). The closely timed repression of P and Ng transcripts reveals that exit from the
cell cycle and initial postmitotic specification are partially overlapping rather than
strictly sequential processes. We used these program-specific dynamics to identify a
broader set of proliferative-type (Ptype), neurogenic-type (Ngtype), and neuronal-type
(Ntype) transcripts (fig. S4D and data table S1). The functional relevance of these three
programs was supported by the enrichment of Ptype, Ngtype, and Ntype transcripts in the
VZ, SVZ, and CP, respectively; differential enrichment in specific gene ontology terms;
and sequential expression in single cells (fig. S4, D and E, and fig. S5). These findings
reveal the highly dynamic unfolding of proliferative, neurogenic, and neuronal programs
after mitosis in vivo.

Fig. 2. Identification of newborn cortical neurons. (A to C) Unbiased clustering delineates neurons from progenitors. (A) Apical progenitors, daughter basal
progenitors, and newborn neurons can be distinguished by unbiased clustering (left), temporal distribution (top right), hierarchical clustering (bottom right), and
[(B) and (C)] expression of specific markers and Ptype, Ngtype, and Ntype transcripts. (B, bottom right) Schematic in (B) provides examples of type-specific genes
presented in figs. S6 to S7. Although basal progenitors eventually give rise to neurons (dotted arrow), this progeny is not included in the current data set because
FT⁺ cells are essentially VZ-born (see fig. S1). (D) Spatial segregation of progenitor and neuron-specific transcripts with in situ hybridization (24). Values represent
median expressions for several transcripts. (E) Rapid segregation of cell-type-specific transcripts after cytokinesis. P < 0.0001 for all values compared to 6-hour apical

Two main classes of juxtaventricular cells are initially labeled by FT in the VZ:
(i) progenitor cells and (ii) newborn neurons (Fig. 1A and fig. S1D). We sought to
identify neuron-specific transcriptional programs by distinguishing neurons from
progenitors. For this purpose, we used a machine-learning approach to cluster cells
based on transcriptional expression signatures (13).

This approach delineated distinct groups of cells, which were identified as progenitors
[genuine apical progenitors and daughter basal progenitors (14)] and neurons (early and
late populations) (Fig. 2A). Apical progenitors and daughter basal progenitors were
distinguished based on differential expression of markers such as Eomes and Btg2 [which
are enriched in basal progenitors (see references in table S1)] (Fig. 2B) and differential
enrichment in Ptype genes (apical > basal progenitors), including Nes and Sox2 (Fig. 2C).
Accordingly, cells in the apical progenitor cluster were mostly 6 hours old, whereas
newborn basal progenitor identity was more distinct after 12 hours (Fig. 2A, top right).
Neurons expressed core neuronal genes such as Tbr1 and Mapt, and were enriched in Ntype
genes (Fig. 2, B and C). With apical progenitors and their daughter neurons now
distinguishable, we identified cell-type specific, stage-specific transcripts by comparing
gene expression at each developmental age (data table S2 and fig. S8). Consistent with
this classification, apical progenitor genes were predominantly expressed in the VZ, basal
progenitor genes extended into the SVZ, and neuronal transcripts showed stage-specific
sequential expression in the VZ, SVZ, and CP (Fig. 2D and fig. S9). Hierarchical
relationship analysis revealed that apical progenitors are clearly distinct from daughter
basal progenitors and neurons (Fig. 2A, bottom right), further supporting the lineage
relationships identified above. Segregation of type-specific transcripts between newborn
neurons and their progenitors was detected as early as 6 hours after mitosis (Fig. 2E).
This suggests that type-specific transcripts can be premitotically segregated or
differentially stabilized in nascent postmitotic neurons versus progenitors. Together,
these data identify progenitor and neuron-specific transcripts activated after cell
division and reveal rapid cell-type specific segregation and regulation of transcripts
after mitosis.

Fig. 3. Real-time functional transcriptomics of early postmitotic neurons in vivo. (A) Neurons are staggered by age along the pseudotime axis. (B) Gene
expression dynamics for classical proliferative (Sox2), neurogenic (Neurog2), and neuronal (Tbr1) genes. Neurod2 is expressed more strongly and earlier than Tbr1.
QR code, http://genebrowser.unige.ch/science2016, for access to dynamics of all transcripts. (C) Unbiased clustering of genes based on expression dynamics
reveals distinct transcriptional waves with sequential expression peaks (black arrowheads). Illustrative transcription factors are provided for each wave (see also
fig. S10 and data table S3). (Right) Chromatin immunoprecipitation sequencing-identified targets of NEUROD2 (18) are enriched in its own wave but also are
present across waves (see also fig. S11). (D) Summary of wave dynamics related to developmental time. (E) Gene ontology term-based analysis. Colors correspond
to wave numbers. (F) Double-strand DNA breaks are transiently increased in 12-hour-old cells, as indicated by the presence of phosphorylated histone 2AX
(γH2AX) (25). **P < 0.001.

Fig. 4. Early expression of the late-wave gene Nrn1 induces premature neuronal differentiation. (A to D) Premature expression of Nrn1 (A) leads to a forward
shift in neuronal differentiation by inducing cell-cycle exit (decreased number of Ki67⁺ progenitors) (B) and premature neuronal differentiation (increased number
of ND2⁺ neurons) (C). This effect occurs within 12 hours of birth, as assessed within an isochronic 12-hour-old cohort of FT⁺ cells (D). (A, left) In situ
hybridization (24). (E) NRN1-overexpressing neurons undergo premature migrational arrest before reaching L4. *P < 0.05; **P < 0.001; ***P < 0.0001. ND2,
NEUROD2.

To establish a real-time quantitative account 
of differentiation programs in newborn neurons, 
we used an unsupervised approach in which single- 
cell expression profiles are temporally ordered 
based on distinct intermediate differentiation 
states (Fig. 3A) (15, 16). This method appro- 
priately ordered neurons along a pseudotime 
axis, with 6-, 12-, 24-, and 48-hour-old neurons 
being progressively staggered along this time 
line (Fig. 3A). This allowed us to reconstruct the 
expression dynamics of all transcripts across this 
pseudotime axis and generate a high-resolution 
transcriptomic atlas of the first 48 hours of 
L4 cortical neuron development (Fig. 3B) (see 
http://genebrowser.unige.ch/science2016 for 
the data set of all transcripts). The expression 
dynamics of classical P (Sox2), Ng (Neurog2), and N (Tbr1) transcripts were consistent
with their function (Fig. 3B). Neurod2 was identified as an early-onset neuronal
transcript; accordingly, NEUROD2 protein was detected in newborn apical VZ neurons,
whereas this was not the case for TBR1 (Fig. 3B, inset).
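
The idea of placing single cells along a differentiation axis can be sketched with a much simpler stand-in than the unsupervised method cited above. The example below is not that method: it scores each cell by neuronal-marker expression minus proliferative-marker expression and sorts on that score. The helper names and all expression values are invented for illustration; only the gene names come from the text.

```python
from statistics import mean


def differentiation_score(cell, neuronal_genes, proliferative_genes):
    """Mean neuronal-marker expression minus mean proliferative-marker expression."""
    neuronal = mean(cell[g] for g in neuronal_genes)
    proliferative = mean(cell[g] for g in proliferative_genes)
    return neuronal - proliferative


def order_cells(cells, neuronal_genes, proliferative_genes):
    """Return cell identifiers sorted from least to most differentiated."""
    scores = {
        cell_id: differentiation_score(expr, neuronal_genes, proliferative_genes)
        for cell_id, expr in cells.items()
    }
    return sorted(scores, key=scores.get)


# Toy expression values (arbitrary units), invented for illustration only.
toy_cells = {
    "cell_48h": {"Sox2": 0.1, "Tbr1": 3.0, "Neurod2": 3.5},
    "cell_6h": {"Sox2": 3.0, "Tbr1": 0.0, "Neurod2": 0.2},
    "cell_24h": {"Sox2": 0.5, "Tbr1": 1.0, "Neurod2": 2.5},
    "cell_12h": {"Sox2": 1.5, "Tbr1": 0.2, "Neurod2": 1.2},
}
ordering = order_cells(toy_cells, ["Tbr1", "Neurod2"], ["Sox2"])
print(ordering)
```

A marker-based score of this kind presupposes which genes matter, whereas an unsupervised ordering infers the axis from the full expression profile; the sketch only conveys what "staggered along pseudotime" means.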

Clustering of expressed transcripts based 
on their expression dynamics showed how tran- 
scriptional networks are organized in newborn 
neurons. Directly after mitosis, waves of tran- 
scriptional programs sequentially unfold, each 
including temporally distinct complements of 
transcription factors and networks (Fig. 3, C 
and D, fig. S10, data table S3, and movie S2). To 
understand the temporal organization of the 
molecular pathways across differentiation, we 
focused on the genetic network of Neurod2, a 
wave 5 transcription factor required for L4 
neuron maturation and whose target genes have 
been identified in the E14.5 neocortex (17, 18). 
The temporal distribution of NEUROD2 target 
genes across the distinct waves was not ran- 
dom (Fig. 3C, right, and data table S4). Instead, 
NEUROD2 targets were strongly enriched in its own wave (e.g., Nrn1 and EphB2), in line
with its role in neuritogenesis, but also present across waves, including in wave 1,
where targets include cyclins and cyclin-dependent kinases such as Ccnd2, Ccnd3, and
Cdk13, which control cell cycle progression. NEUROD2 may therefore
act not only on isochronically expressed genes 
but also across differentiation. Consistent with 
a repressive action on wave 1 targets, overex- 
pression of NEUROD2 through in utero electro- 
poration into VZ progenitors induced exit from 
the cell cycle, as indicated by decreased numbers 
of Ki67* VZ cells (fig. S11). Single transcription 
factors can therefore control distinct differenti- 
ation events through combinatorial actions on a 
variety of temporally gated genetic targets and 
networks. 
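
To make the notion of sequential waves concrete, the sketch below groups genes by where their expression peaks along pseudotime. This is a deliberate simplification and not the unbiased clustering used in the study; the function name, gene labels, and expression profiles are hypothetical.

```python
def assign_waves(profiles, n_waves):
    """Group genes by the position of their expression peak along pseudotime.

    profiles: dict mapping gene name -> list of expression values sampled at
              equally spaced pseudotime points (all lists the same length).
    Returns a dict mapping wave number (1..n_waves) -> sorted list of genes.
    """
    waves = {w: [] for w in range(1, n_waves + 1)}
    for gene, values in profiles.items():
        # Index of the first maximum of the profile.
        peak_index = max(range(len(values)), key=values.__getitem__)
        # Map the peak position onto one of n_waves equal pseudotime windows.
        wave = 1 + peak_index * n_waves // len(values)
        waves[wave].append(gene)
    return {w: sorted(genes) for w, genes in waves.items()}


# Hypothetical profiles sampled at six pseudotime points.
toy_profiles = {
    "early_gene": [5, 3, 1, 0, 0, 0],
    "mid_gene": [0, 1, 4, 5, 2, 0],
    "late_gene": [0, 0, 0, 1, 3, 6],
}
waves = assign_waves(toy_profiles, n_waves=3)
print(waves)
```

Peak position ignores the shape of each profile, so genes with broad or bimodal expression would be better handled by clustering on the full time course.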

Ontology term analysis of the transcriptional waves identified successive functional
differentiation events in newborn neurons (Fig. 3E). We observed an initial rapid (6 to
12 hours after mitosis) repression of proliferation-associated transcripts (e.g., Arv,
Notch1, and Sox9) and a surge in transcripts associated with ribosome biogenesis and
translation (e.g., Fi/1, Rpl13a, and Rpl12), which might reflect nucleolar reassembly and
increased protein synthesis. Transcripts associated with DNA repair (e.g., DNA2, Ddb1, and
Exo1) were transiently increased after mitosis, suggesting postmitotic genetic
instability. Consistent with this possibility, DNA double-strand breaks were significantly
increased in 12-hour-old neurons (Fig. 3F). This reveals a critical period after mitosis
during which neocortical neurons are susceptible to somatic mutations and where clonal
mosaicism could be generated (19, 20). Twelve-hour-old neurons already initiated
differentiation programs related to late-occurring processes such as synaptogenesis,
revealing an early transcriptional poise in anticipation of terminal differentiation
events. Finally, chemotaxis-associated transcripts (e.g., Ephb1, L1cam, and Nrp1) peaked
around 42 hours after birth, while neurons are reaching the CP, providing a molecular
framework for input-dependent differentiation processes (21).

Finally, we examined whether the distribution of transcript expression across waves
instructs the sequence and pace of neuronal differentiation events. For this purpose, we
prematurely expressed a late-wave transcript, Nrn1, which normally peaks ~30 hours after
mitosis (wave 5, Fig. 4A) and controls L4 neuron maturation through promotion of
neuritogenesis (22, 23). We hypothesized that heterochronic expression of this normally
late-occurring gene could bypass early processes and accelerate neuronal differentiation.
Indeed, in utero electroporation of Nrn1 led to premature transition to neuronal identity
and precocious expression of NEUROD2 (Fig. 4, B and C). Premature acquisition of this
neuronal trait was detectable as early as 12 hours after cell birth, as revealed by
assessing NEUROD2 expression within an isochronic 12-hour-old cohort of FT⁺ cells with
mosaic overexpression of NRN1 (Fig. 4D). Finally, precocious molecular maturation was
associated with an early loss of migrational capacity, leading to neuronal mispositioning
at birth (Fig. 4E). Therefore, the precise timing of early differentiation programs is
critical not only for the execution of single-cell differentiation events but also for
the successful organization of the cortical networks to which these cells belong. Precise
and dynamic temporal control over the expression of even single genes thus controls the
sequence and pace of neuronal differentiation, which is essential for circuit assembly.

Our data provide a comprehensive transcrip- 
tional blueprint outlining the dynamic acqui- 
sition of neuronal identity in vivo. We show 
that early neuronal differentiation is directed by 
a series of transcriptional waves whose proper 
sequence is critical for normal progression through 
development. These waves provide discrete time 
windows during which specific transcriptional 
complexes are present simultaneously and can 
interact. These transient combinatorial transcriptional niches could act as sequential
checkpoints during the course of differentiation, combinatorially coding for specific
cell fates. These results
build a road map for reverse engineering of 
cortical neuron subtypes from undifferentiated 
cells and provide a set of genetic targets for 
identification and directed differentiation of 
progenitors and nascent neurons.

CLOUD FORMATION


An interfacial mechanism for cloud 
droplet formation on organic aerosols 


Christopher R. Ruehl,* James F. Davies, Kevin R. Wilson†


Accurate predictions of aerosol/cloud interactions require simple, physically accurate 
parameterizations of the cloud condensation nuclei (CCN) activity of aerosols. Current 
models assume that organic aerosol species contribute to CCN activity by lowering water 
activity. We measured droplet diameters at the point of CCN activation for particles 
composed of dicarboxylic acids or secondary organic aerosol and ammonium sulfate. 
Droplet activation diameters were 40 to 60% larger than predicted if the organic was 
assumed to be dissolved within the bulk droplet, suggesting that a new mechanism is 
needed to explain cloud droplet formation. A compressed film model explains how surface 
tension depression by interfacial organic molecules can alter the relationship between 
water vapor supersaturation and droplet size (i.e., the Köhler curve), leading to the larger
diameters observed at activation.


Accurate predictions of the impact of aerosols on cloud properties, and thus the radiative
balance of the atmosphere, rely on simple parameterizations of cloud droplet formation.
Despite the simplicity required of such parameterizations, they must be based on robust
chemistry and physics to ensure the validity of climate predictions. From Köhler theory
(1), the cloud condensation nuclei (CCN) activity of an aerosol is governed both by its
size and by its molecular constituents, which can lower the water activity and/or surface
tension of aqueous droplets below that of pure water. There are a number of extensively
used empirical parameterizations of Köhler theory that neglect surface tension depression
and assume that cloud droplets form on particles that contain sufficient solute (2, 3).
In our study, droplet sizes measured up to and including the point of cloud droplet
activation reveal that most organic aerosol (OA) contributes to CCN activity by adsorbing
to the air/droplet interface.
One popular parameterization of Köhler theory, known as κ-Köhler theory (3), describes the
lowering of water activity by a solute using a single parameter, κ: a dimensionless ratio
of the molar volume of water to the average osmolar volume of the aerosol. κ-Köhler theory
is used to interpret field observations and laboratory experiments in an effort to relate
aerosol composition to hygroscopicity (4). Although many field studies, including those in
Colorado (5), Ontario (6), coastal California (7), and the Amazon (8), yield reasonable
values for the hygroscopicity of the organic component of the aerosol (κ_org) when
interpreted via κ-Köhler theory, there are observations that suggest more complex behavior
(non-ideal solution and surface tension effects) not captured in current water
activity-based parameterizations of CCN activity (9, 10).

Neglecting surface activity in CCN parameterizations appears consistent with predictions
of surface-bulk partitioning models (e.g., Szyszkowski-Langmuir adsorption theory). When
using parameters obtained for macroscopic solutions, the bulk concentrations of
surface-active solutes are predicted to be strongly depleted in microscopic droplets, thus
increasing water activity and negating any increase in CCN activity caused by a reduction
in surface tension (11). This has led many to conclude that accounting for surface
activity is not necessary to accurately predict the CCN activity of OA, despite
measurements of reduced surface tension in macroscopic aqueous solutions of atmospheric OA
and relevant model compounds (12, 13). Furthermore, partitioning models fail to predict
concentrations of surfactants in submicrometer droplets, suggesting significant
limitations of current approaches to accurately describe surface activity at that length
scale (14).

Using a continuous-flow streamwise thermal gradient chamber [section S1 of (15)], we
measured the diameter of droplets (D_wet) that form on mixed organic-ammonium sulfate (AS)
particles at water vapor supersaturation [S, or relative humidity (RH) − 100%] approaching
and including the point of CCN activation. Although it is common to compute Köhler curves
to predict the critical supersaturation (S_c) of aerosols, direct measurements of D_wet as
a function of S (15) allow OA CCN activity to be additionally constrained by droplet
activation diameter (D_wet,c).

As shown in Fig. 1A, pure 200-nm-diameter (D_dry) AS particles activate at
D_wet,c ~ 2.5 μm, consistent with κ-Köhler theory. Mixed aerosols of AS and sucrose (known
to be surface-inactive) activate at D_wet,c = 0.8 μm, consistent with a water
activity-based parameterization (Fig. 1B). These results are in contrast to AS aerosol
coated with a series of small dicarboxylic acids (Fig. 1, C and D, and figs. S5 to S7) and
α-pinene secondary OA (SOA, Fig. 2). The functional form of D_wet versus S deviates
substantially from κ-Köhler predictions, with a D_wet,c that is ~50% larger than predicted
by constant κ_org for a given S_c. For succinic acid-coated AS (Fig. 1D), D_wet,c is
observed to be 1.8 μm, which is substantially larger than predicted (D_wet,c = 14 μm),
assuming κ_org = 0.31. A similar difference was observed for malonic acid (Fig. 1C). For
SOA-coated AS in Fig. 2, D_wet,c = 1.3 to 1.4 μm, which is much larger than the κ_org
predictions of D_wet,c = 0.9 μm.
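
As a rough check on the pure AS case, the κ-Köhler curve can be evaluated numerically. The sketch below uses the standard single-parameter form, in which the saturation ratio is the product of a water activity term and a Kelvin term, and scans droplet diameters for the maximum. The value κ = 0.61 for ammonium sulfate, the temperature of 298.15 K, and the pure-water surface tension are assumptions not stated in the text, and the function names are hypothetical.

```python
import math

SIGMA_WATER = 0.072          # N/m, surface tension of pure water near 298 K (assumed)
MOLAR_MASS_WATER = 0.018015  # kg/mol
DENSITY_WATER = 997.0        # kg/m^3
GAS_CONSTANT = 8.314         # J/(mol K)


def saturation_ratio(d_wet, d_dry, kappa, temperature=298.15, sigma=SIGMA_WATER):
    """kappa-Kohler saturation ratio for a droplet of diameter d_wet (meters)."""
    # Water activity term: solute lowers the equilibrium vapor pressure.
    water_activity = (d_wet**3 - d_dry**3) / (d_wet**3 - d_dry**3 * (1.0 - kappa))
    # Kelvin term: surface curvature raises the equilibrium vapor pressure.
    kelvin = math.exp(
        4.0 * sigma * MOLAR_MASS_WATER / (GAS_CONSTANT * temperature * DENSITY_WATER * d_wet)
    )
    return water_activity * kelvin


def activation_point(d_dry, kappa, n_points=20000):
    """Scan diameters; return (critical diameter in m, critical supersaturation in %)."""
    log_lo, log_hi = math.log(1.01 * d_dry), math.log(50e-6)
    best_d, best_s = None, -math.inf
    for i in range(n_points):
        d = math.exp(log_lo + (log_hi - log_lo) * i / (n_points - 1))
        s = saturation_ratio(d, d_dry, kappa)
        if s > best_s:
            best_d, best_s = d, s
    return best_d, (best_s - 1.0) * 100.0


d_crit, s_crit = activation_point(d_dry=200e-9, kappa=0.61)
print(f"D_wet,c = {d_crit * 1e6:.2f} um, S_c = {s_crit:.3f} %")
```

With these assumed inputs the maximum falls at roughly 2.6 μm, of the same order as the ~2.5 μm reported for pure AS. Lowering `sigma` in `saturation_ratio` is the simplest way to explore how surface tension depression shifts the curve, although a constant value does not reproduce the size-dependent behavior of the compressed film model.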

For pure AS aerosol, the evolution of D_wet with S, and the size of D_wet,c, are exactly
what is predicted by κ-Köhler theory and serve to validate the experimental approach. The
mixed AS/sucrose observations show that κ-Köhler theory correctly predicts D_wet versus S
and D_wet,c in the case of a highly water-soluble organic compound that does not depress
surface tension (i.e., is surface-inactive). In contrast, AS aerosols coated with a series
of dicarboxylic acids, some of which are known to be surface-active, exhibit much larger
activation diameters and a different functional form of D_wet versus S than is predicted
by κ-Köhler theory (Figs. 1 and 2 and figs. S3 to S7). Although limited organic solubility
can alter the shape of the Köhler curve (16), it cannot explain the consistently large
droplet activation diameters observed for these dicarboxylic acids, whose bulk solubility
varies from near that of sucrose (i.e., malonic acid) to over two orders of magnitude
smaller (table S1). The mixed SOA/AS aerosols exhibit the same deviation from κ-Köhler
theory as the dicarboxylic acids. Collectively, the observed differences between D_wet for
the organic acids (also SOA) and sucrose suggest that the discrepancies with κ-Köhler
theory originate from surface effects rather than non-ideal behavior of mixtures.

To account for the surface activity of organics within Köhler theory, an equation of state
is required to relate the bulk and surface concentrations of the various solution
components. Associated with the equation of state is an isotherm that relates organic
surface concentration to surface tension. Previously, the Szyszkowski-Langmuir equation
(13, 17) was used to compute Köhler curves (i.e., to predict S_c) for model organic
compounds (16). This particular treatment of bulk-surface partitioning [section S2 of
(15)] cannot fully explain the observed D_wet as a function of S. Instead, an equation of
state that allows for a two-dimensional (2D) phase transition is required. To provide a
self-consistent model description of the observations shown in Figs. 1 and 2, a compressed
film model (18) was used to describe the relationship between surface tension depression
and organic surface coverage or thickness on the droplet (figs. S3 to S7). Model details
can be found in section S2 of (15) and are described conceptually here. The model contains
a 2D phase transition between “gaseous” and “compressed” surface states, which depends on
surface concentration (i.e., molecular packing). Because the quantity of organic material
is fixed by the composition of the original dry aerosol, changes in surface tension occur
when S increases and the droplet grows, decreasing the surface concentration by providing
a larger surface area per molecule. At low S (below S_c), the surface organic
concentration is high, and the molecules adopt a compressed state, which lowers the
droplet’s surface tension below that of liquid water. At higher S (near S_c), the droplets
are larger, the surface concentration is lower, and the molecules at the interface are
noninteracting (i.e., in a gaseous surface state), with a droplet surface tension nearly
equal to that of pure water. The data shown in Figs. 1 and 2 (and figs. S3 to S7) are best
replicated if it is assumed that the surface tension increases linearly with decreasing
surface concentration (18). As shown in Figs. 1 and 2 and figs. S3 to S7, the compressed
film model can reasonably account for the functional form of S versus D_wet. Although the
compressed film model shows that some of the organic material is dissolved in the droplet
bulk, the model reveals that for SOA and most of the model compounds [section S3 of (15)]
most of the organic material is at the droplet surface (Fig. 2B and figs. S3 to S7), in
stark contrast with the underlying assumption of κ-Köhler theory.

The compressed film model predicts that a 
particle will reach Sc when the surface film decreases
in thickness to the point that individual
molecules begin to separate; i.e., a 2D phase
transition occurs, and the surface tension no
longer varies with increasing Dwet (15). This
occurs at an S and Dwet that lies on the post-activation
portion of the Köhler curve for the
bare inorganic AS seed. At this intersection point, 
all of the hygroscopicity can be attributed to the 
reduction of water activity by AS. The compressed 
film model explains this by predicting that the 
organic material is adsorbed to the interface 
(hence no decrease in water activity by the organic),
but Dwet is large enough to lower the organic
surface concentration to a point where
the resulting surface tension is near that of pure 
water. Although there are several parameters 
needed to constrain the model, a single-parameter 
approximation to the compressed film model 
can be used if it is assumed that all the organic 
material resides at the droplet surface. This parameter,
termed δorg, corresponds to a film thickness
(in nanometers) on the droplet surface where
the 2D phase transition occurs (i.e., where the 
surface tension depression goes to zero). Although 
surface concentration is used more often than 
film thickness for monolayers of known compo- 
sition, film thickness is preferable for discussions 
of particle hygroscopicity, because aerosol com- 
position is often complex and poorly constrained 
molecularly, and because sizes (particle and drop- 
let diameters) are measured in CCN experiments. 
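
The single-parameter picture described above can be sketched numerically. The following is a minimal illustration, not the model of section S2 of (15): it assumes all organic material forms a uniform thin film on the droplet, that water activity is set only by the AS seed through a κ-type expression, and that surface tension falls linearly with film thickness above δorg. The `slope` and `sigma_min` values, the physical constants, and all function names are illustrative assumptions rather than fitted quantities from the study.

```python
import math

SIGMA_WATER = 0.072   # N/m, approximate surface tension of pure water (assumed)
M_WATER = 0.018015    # kg/mol, molar mass of water
RHO_WATER = 997.0     # kg/m^3, density of water
R_GAS = 8.314         # J/(mol K), gas constant


def film_thickness_nm(d_wet_nm, d_dry_nm, d_seed_nm):
    """Thickness of a uniform organic film, assuming all organic is at the surface."""
    v_org = math.pi / 6.0 * (d_dry_nm ** 3 - d_seed_nm ** 3)  # organic volume, nm^3
    return v_org / (math.pi * d_wet_nm ** 2)                   # thin-film limit, nm


def surface_tension(d_wet_nm, d_dry_nm, d_seed_nm, delta_org_nm,
                    slope=0.05, sigma_min=0.040):
    """Surface tension in N/m; slope (N/m per nm) and sigma_min are placeholders."""
    delta = film_thickness_nm(d_wet_nm, d_dry_nm, d_seed_nm)
    if delta <= delta_org_nm:
        # Film thinner than delta_org: "gaseous" surface state, pure-water value.
        return SIGMA_WATER
    # Compressed film: assumed linear depression, bounded below by sigma_min.
    return max(sigma_min, SIGMA_WATER - slope * (delta - delta_org_nm))


def saturation_ratio(d_wet_nm, d_dry_nm, d_seed_nm, kappa_inorg,
                     delta_org_nm, temp_k=298.15):
    """Koehler-type saturation ratio with a size-dependent surface tension."""
    v_water = d_wet_nm ** 3 - d_dry_nm ** 3
    # Only the inorganic seed lowers water activity in this approximation.
    a_w = v_water / (v_water + kappa_inorg * d_seed_nm ** 3)
    sigma = surface_tension(d_wet_nm, d_dry_nm, d_seed_nm, delta_org_nm)
    kelvin = math.exp(4.0 * sigma * M_WATER
                      / (R_GAS * temp_k * RHO_WATER * d_wet_nm * 1e-9))
    return a_w * kelvin
```

Scanning `saturation_ratio` over a range of wet diameters and taking the maximum gives an estimate of Sc under these assumptions; the location of that maximum depends strongly on the placeholder slope, so the sketch is useful for intuition only.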

The Sc for a particle is computed from δorg if
both the organic fraction (forg) and κinorg (inorganic
hygroscopicity) are known [section S2 of
(15)], which is the same set of parameters required
to predict Sc given κorg. In Fig. 3, the surface (δorg)
and bulk activity (κorg) parameterizations are
compared with additional measurements of Sc
versus dry organic coating thickness on AS seed
particles for the same series of dicarboxylic acids
and SOA. The measured critical supersaturation
(Sc) decreases with increasing dry organic coating
thickness for all particles measured. The data


1448 25 MARCH 2016 • VOL 351 ISSUE 6280


for each compound/SOA in Fig. 3 are fit with
both a constant organic osmolar volume (related
to κorg) and film thickness (δorg).

In most cases, constant κorg does not replicate
the observed curvature in Sc versus coated
diameter (Fig. 3), generally predicting a shallower
slope than is observed. There are two
exceptions: malonic and suberic acid. However,
the molar volumes required to replicate these
data are much smaller than what is reported in
the literature: by 36% for malonic acid and by a
factor of 3.7 for suberic acid. Such a dramatic increase in
hygroscopicity cannot be explained by the dissociation
of these weak acids. In contrast, the
δorg approximation does capture the curvature
observed in most plots of Sc versus forg shown
in Fig. 3. Thus, the δorg approximation to the
compressed film model correctly accounts for 


[Fig. 1 plot residue. x-axis: Dwet (μm). Legend: κorg, AS water activity, Film. Panel A: ammonium sulfate (AS). Panel D annotation: κorg = 0.39, δorg = 0.13 nm.]


both the droplet size at activation (Figs. 1 and 2
and figs. S3 to S7) and the evolution of Sc with
dry OA fraction (Fig. 3).

The compressed film model offers an expla- 
nation for some recent ambient CCN observa- 
tions. In CCN closure studies, agreement between 
observations and predictions is often best when 
the OA fraction is assumed to be insoluble, as 
recently reported in California (19). In particles
that have similar amounts of organic and inor- 
ganic material, or are predominantly inorganic, 
there is not enough organic material to form a 
compressed film on the droplet at or near the 
point of activation. Although this organic mate- 
rial will be adsorbed to the droplet surface, there 
is an insufficient concentration at the interface 
to reduce surface tension; instead, the organic 
will effectively behave as insoluble material with 

Fig. 1. Köhler curve observations for (A) pure AS, (B) sucrose + AS, (C) malonic acid + AS, and
(D) succinic acid + AS particles. D = 200 nm for pure AS particles, and for all other particles D =
150 nm (a 50-nm AS seed + a 50-nm radial coating thickness, corresponding to 89% organic by volume).
As described in section S1 of (15), Dwet is measured by phase Doppler interferometry, along the centerline
of a thermal gradient chamber (29) after ~10 s of exposure to RH ~ 100% and is therefore not
sensitive to decreases in surface tension that might occur over longer time scales as recently observed
for aerosol and biological surfactants (14). Solid and open circles represent unactivated and activated
droplets, respectively. Because it is not always apparent when activation occurs solely from measurements
of Dwet, Sc (horizontal solid black lines) is measured using a separate CCN counter (CCNC,
Droplet Measurement Technologies). Dashed red lines are the Köhler curves predicted using a water
activity parameterization (κorg). Solid blue lines are those predicted by the compressed film model
[section S2 of (15)], and dashed black lines are those for the AS seed particles. The solid red line in (A)
is the Köhler curve obtained with a parameterization of the water activity of dilute AS (30). Also shown
are the points of CCN activation predicted by κorg (red diamonds) and δorg (blue diamonds).

respect to CCN activation. Recently, activation
diameters inferred from particle and droplet size 
distributions in urban fog were unexpectedly 
large (20), similar to what was observed in this 
study. Finally, observations of enhanced OA hy- 
groscopicity that were recently reported for a 
coastal site on Vancouver Island (9) do not seem 


Fig. 2. Köhler curve observations for AS + α-pinene SOA particles (D = 175 nm) prepared
in a flow tube reactor. As described in section S1 of (15), SOA was generated by ozonolysis in a
flow tube reactor and coated onto 85-nm-diameter AS seeds (91% SOA by volume).
(A) Dashed red lines are the Köhler curves predicted using a water activity parameterization;
solid blue lines are those predicted with the compressed film model [details in section S2 of
(15)]; and dotted black lines are those for the AS seed particles.


reasonable for marine organic material, suggesting,
in light of our results, that surface activity
likely makes a major contribution to CCN
activity.

The compressed film model also offers an ex- 
planation for some unresolved questions arising 
from laboratory CCN studies, including several 
The horizontal black line indicates the Sc observed with a conventional CCN instrument (CCNC, Droplet
Measurement Technologies). Also indicated is the point of CCN activation predicted by κorg (red) and
δorg (blue) symbols. Included are the (B) surface tension (σ) and (C) fraction of SOA at the surface
(fsurf), as predicted by the compressed film model.
Fig. 3. Sc as a function of coated
dry diameter for (A to F) a series
of dicarboxylic acids or (G)
SOA generated via ozonolysis of
α-pinene coated onto AS seed
particles (D = 35 nm). Both seed
and coated diameters were size- 
selected by a differential mobility 
analyzer (TSI, Model 3080). 
Dashed and solid lines are the best 
fits for each substance, using a 0.6 
single-component OAs that exhibit anomalously
large CCN activity despite limited bulk solubility
(21-24). For example, pimelic acid is CCN-active
(κ = 0.14 to 0.16), despite its low solubility, which
suggests that κ should be five times smaller than
observed (25). This model predicts that most 
OA (especially OA of limited solubility) is ad- 
sorbed to the surface of microscopic droplets. 
Thus, the relatively small bulk concentrations 
may not, in fact, exceed solubility limits. Although 
Raoult (water activity) effects may also be impor- 
tant for aerosol with unknown or more complex 
composition (26), it is an unlikely explanation 
for the anomalously high CCN activity of indi- 
vidual compounds. The compressed film model 
also resolves the “gap” between κorg values derived
from CCN activity experiments, which are
often much larger than those derived from sub- 
saturated measurements of hygroscopic growth 
(27). The surface tension reduction by a com- 
pressed film will only increase hygroscopicity at 
high RH (near 100%) (28), whereas at lower RH, 
where the water activity term dominates, the 
effect of the film will be to lower hygroscopicity 
relative to the fully soluble assumption. 

These results point to an alternative mecha- 
nism for cloud droplet formation in mixed organic/ 
inorganic aerosol. Although a water activity- 
based parameterization correctly predicts the 
droplets’ sizes under supersaturated conditions 
for pure AS and sucrose (a non-surface-active 
compound), it fails to correctly account for the 
larger cloud droplets that form on AS coated 
with a series of dicarboxylic acids (with a broad 
range of water solubility) and SOA. Thus it is 
unlikely that our results can be attributed to bulk 
solubility effects, but rather can be explained if 
most of the organic material exists as an inter- 
facial compressed film, which reduces surface 
tension, allowing larger droplets to form before 
activation. At activation, the compressed film 
transitions to a gaseous state, and the surface 
tension of the droplet is nearly equal to that of 
pure water. Although the assumption that organic
material is dissolved in the droplet bulk may
yield reasonable CCN predictions in field measurements
and in the laboratory, these results
can help explain several outstanding questions
and highlight the potential importance of interfacial
organics in the formation of cloud droplets
on OA.

Survey of variation in human
transcription factors reveals
prevalent DNA binding changes


Luis A. Barrera, Anastasia Vedenko, Jesse V. Kurland, Julia M. Rogers,
Stephen S. Gisselbrecht, Elizabeth J. Rossin, Jaie Woodard, Luca Mariani,
Kian Hong Kock, Sachi Inukai, Trevor Siggers, Leila Shokri, Raluca Gordan,
Nidhi Sahni, Chris Cotsapas, Tong Hao, Song Yi, Manolis Kellis,
Mark J. Daly, Marc Vidal, David E. Hill, Martha L. Bulyk


Sequencing of exomes and genomes has revealed abundant genetic variation affecting the 
coding sequences of human transcription factors (TFs), but the consequences of such 
variation remain largely unexplored. We developed a computational, structure-based 
approach to evaluate TF variants for their impact on DNA binding activity and used 
universal protein-binding microarrays to assay sequence-specific DNA binding activity 
across 41 reference and 117 variant alleles found in individuals of diverse ancestries and 
families with Mendelian diseases. We found 77 variants in 28 genes that affect DNA binding 
affinity or specificity and identified thousands of rare alleles likely to alter the DNA 
binding activity of human sequence-specific TFs. Our results suggest that most 
individuals have unique repertoires of TF DNA binding activities, which may contribute
to phenotypic variation.


Exome sequencing studies have identified
many nonsynonymous single-nucleotide
polymorphisms (nsSNPs) in transcription
factors (TFs) (1). Genetic variants that alter
transcript expression levels have been associated
with human disease risk and are widespread
in human populations (2, 3). Numerous
Mendelian diseases are attributable to mutations 
in TFs (4). Missense SNPs that change the amino 
acid sequence of TF DNA binding domains (DBDs) 
might disrupt their DNA binding activities and 
thus have detrimental effects on their gene reg- 
ulatory functions. Despite their medical impor- 
tance, the consequences of coding variation in 
DBDs for TF function have remained largely 
unexplored. 

We identified 53,384 unique DBD polymor- 
phisms (DBDPs) (table S1) (here, defined as mis- 
sense variants) in a curated, high-confidence set 
of 1254 sequence-specific human TFs (5, 6) (table 
S2) from genotype data for 64,706 individuals
encompassing African, Asian, and European ancestries
(Fig. 1A) (1, 2, 7). We also identified 4552
unique nonsense mutations that result in partial
or full DBD truncation (table S3).

We found a median of 60 heterozygous and 20 
homozygous DBDPs (Fig. 1B) per genome. We 
found a significant depletion (odds ratio = 3.7, 
P = 0.005, Fisher’s exact test) of DBDPs among 
TFs with known Mendelian disease mutations 
(6, 8), suggesting that DBDPs in disease-associated 
TFs have phenotypic consequences. 
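
As an illustration of the kind of test reported above, the sketch below computes an odds ratio and a two-sided Fisher's exact P value for a 2 × 2 contingency table using only the standard library. The function name and the example counts are hypothetical; the actual counts behind the reported odds ratio are not given in the text, and the two-sided convention used here (summing all tables no more probable than the observed one) is one common choice, not necessarily the one the authors used.

```python
from math import comb, inf


def fisher_exact_two_sided(a, b, c, d):
    """Odds ratio and two-sided Fisher's exact P for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = pmf(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))
    odds_ratio = (a * d) / (b * c) if b * c else inf
    return odds_ratio, min(p_value, 1.0)


# Hypothetical counts, for illustration only.
odds, p = fisher_exact_two_sided(3, 1, 1, 3)
```

For real analyses an established implementation is preferable; this version is meant only to make the calculation explicit.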

We developed a computational approach (6) to 
evaluate missense substitutions in TF DBDs for 
their impact on DNA binding activity. Existing 
methods for predicting the impact of missense 
mutations (9, 10) do not adequately consider the 
roles of residues in protein-DNA interactions, 
which we reasoned should improve predictions. 
We first focused on homeodomain DBDs, as 


most known Mendelian disease mutations in 
TFs occur in homeodomain proteins. We analyzed 
homeodomain-DNA cocrystal structures in the Pro- 
tein Data Bank to assemble a composite protein- 
DNA “contact map” (fig. S1). As anticipated, residues 
that contact DNA bases or phosphate backbone, 
or that immediately neighbor base-contacting res- 
idues, are enriched among Mendelian disease mu- 
tations (P < 0.005, permutation test). In contrast, 
individuals in the population are depleted for 
variants at base- or backbone-contacting positions 
(P = 0.0134 or 0.0312, respectively, permutation 
test) (Fig. 1C). This highlights the value of con- 
sidering protein-DNA contacts in predicting the 
impact of variants. 
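
A permutation test of the sort cited above can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' procedure: it asks whether variant positions fall on DNA-contacting positions more often than equally sized random draws of domain positions would. The position numbers, the function name, and the choice to sample positions without replacement are all hypothetical.

```python
import random


def permutation_enrichment_p(all_positions, contact_positions, variant_positions,
                             n_perm=10000, seed=0):
    """One-sided permutation P value for enrichment of variants at contact positions."""
    rng = random.Random(seed)
    contact = set(contact_positions)
    observed = sum(1 for pos in variant_positions if pos in contact)
    k = len(variant_positions)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Draw k positions at random from the domain (without replacement).
        sample = rng.sample(all_positions, k)
        if sum(1 for pos in sample if pos in contact) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


# Hypothetical 60-residue domain with six contact positions.
domain = list(range(1, 61))
contacts = [3, 5, 47, 50, 51, 54]
observed, p_value = permutation_enrichment_p(domain, contacts, [3, 5, 47, 50, 51])
```

Testing for depletion instead would count permutations with a contact count less than or equal to the observed one.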

On the basis of these results, we expanded our 
approach to other TF families. For each variant 
we considered multiple criteria, including (i) po- 
sition of the residue relative to the protein-DNA 
interface in homologous cocrystal structures (fig. 
Fig. 1. Evaluation of coding variation in TF DBDs. (A) Number of unique DBDPs in 1kG (Phase 3), ESP 6500, or ExAC v0.2 individuals.
test), resulting in E-scores indistinguishable from GST negative controls (table S7). (D) Fraction of alleles with observed
determined from PBM binding profiles. Prioritized nsSNPs exclude those predicted as benign by both PolyPhen-2 and SIFT. (E) Violin plots depicting fraction of
8-mer binding sites gained or lost by variants relative to the number of 8-mers bound by the reference allele. Gains or losses were defined as E ≥ 0.4 for one
allele and E < 0.4 for the other allele. *P = 0.0044, Wilcoxon rank-sum test.
Fig. 3. Perturbations in TF DNA binding and gene expression
associated with HOXD13 genetic variants. (A) Heatmap depicting
PBM E-scores of DBD alleles (rows) for all 8-mers (columns) bound
strongly (E > 0.45) by at least one allele, with corresponding motifs (13)
and phenotypes. Rows and columns were clustered hierarchically. Pink
boxes highlight allele-preferred sequences with corresponding motifs,
generated by alignment of the indicated 8-mers (14). Variants in orange
font exhibited altered specificity. “—” indicates no known phenotype.
(B) Scatter plot comparing 8-mer E-scores of HOXD13 reference versus
Q325K alleles. Allele-preferred and allele-common 8-mers (6) are colored.
(C) PBM-derived allele-preferred 8-mers are enriched (*P < 0.01, Wilcoxon
signed-rank test) within genomic regions bound in vivo exclusively by the
respective allele. Dashed horizontal line indicates area under the receiver
operating characteristic curve (AUROC) = 0.5 (no enrichment or depletion).
(D) Genes associated with ChIP-Seq peaks enriched for reference-preferred
versus Q325K-preferred 8-mers are overrepresented (*P < 0.01,
permutation test) among genes up-regulated by the same allele. Z-scores
were calculated with 100 random background gene sets (6).


S1); (ii) DNA binding specificity-determining residues
for particular DBD classes (fig. S2); (iii) scores
from tools that predict mutation pathogenicity
(9, 10); (iv) minor allele frequencies; and (v) phenotypic
associations from genome-wide association
studies (11) or known Mendelian disease
mutations (8).

Using these criteria, we selected 37 TF DBDPs 
(6) to assay for direct, sequence-specific DNA bind- 
ing activity (fig. S3). These DBDPs were obtained 
from the 1000 Genomes Project (1kG) Phase 2, the 
Exome Sequencing Project (ESP 6500), and the 
Exome Aggregation Consortium (ExAC). To cali- 
brate the effects of these nsSNPs, we selected 80 
Mendelian disease mutations, which are known 
or believed to be pathogenic (Fig. 1D) (8, 12). The 
117 variant DBD alleles span six major structural 
classes, representing 41 distinct TF allelic series 
(fig. S4). We assayed these 158 DBD alleles using 
universal protein-binding microarrays (PBMs) 
(6), on which each nonpalindromic 8-base pair 
sequence (8-mer) occurs on at least 32 spots (13) 
(table S4). 

We identified variant-induced changes in DNA 
binding specificity (14) (Fig. 2A) or affinity (Fig.
2B) by comparing the enrichment (E) scores of 
each of 32,768 nonredundant, ungapped 8-mers 
represented on universal PBMs to those of the cor- 
responding reference allele (6, 13). DNA binding 
changes were reproducible across replicate PBM 
experiments and support previously reported DNA
binding affinity differences (table S5 and fig. S5). 
We categorized all 117 variant alleles as having 
altered DNA binding specificity, affinity, both, or 
neither (table S6). Three nsSNPs completely ab- 
rogated sequence-specific DNA binding (Fig. 2C 
and fig. S6). In total, 77 variants altered DNA bind- 
ing affinity and/or specificity (Fig. 2D). Several 
nsSNPs predicted to be damaging but not scored 
here as having altered DNA binding might cause 
subtle changes beyond the sensitivity of our ap- 
proach or alternatively affect protein-protein 
interactions. 

Compared to DBDPs, Mendelian disease mu- 
tants lost a larger fraction of 8-mers bound by 
the corresponding reference alleles (P = 0.0044, 
Wilcoxon rank-sum test), consistent with more 
extreme phenotypes being associated with more 
drastic in vitro binding changes. The fraction
of gained 8-mers did not differ significantly
between these two sets of variants (P =
0.32, Wilcoxon rank-sum test; Fig. 2E).
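
The gained and lost fractions compared above can be made concrete with a short sketch. It assumes the threshold convention given in the Fig. 2E caption (an 8-mer counts as bound at an E-score of 0.4 or more) and normalizes by the number of 8-mers bound by the reference allele. The function name and the example E-scores are hypothetical, and both dictionaries are assumed to cover the same set of 8-mers.

```python
def gained_lost_fractions(ref_scores, var_scores, threshold=0.4):
    """Fractions of 8-mers gained and lost by a variant, relative to reference-bound 8-mers."""
    ref_bound = {kmer for kmer, e in ref_scores.items() if e >= threshold}
    var_bound = {kmer for kmer, e in var_scores.items() if e >= threshold}
    if not ref_bound:
        raise ValueError("reference allele binds no 8-mers at this threshold")
    gained = var_bound - ref_bound   # bound by the variant only
    lost = ref_bound - var_bound     # bound by the reference only
    n_ref = len(ref_bound)
    return len(gained) / n_ref, len(lost) / n_ref


# Hypothetical E-scores for four 8-mers, for illustration only.
ref = {"TAATTAGC": 0.45, "CTAATTAG": 0.42, "GGATTAGC": 0.10, "ACGTTGCA": 0.20}
var = {"TAATTAGC": 0.46, "CTAATTAG": 0.30, "GGATTAGC": 0.41, "ACGTTGCA": 0.10}
gained_frac, lost_frac = gained_lost_fractions(ref, var)
```

Because the denominator is the reference-bound count, the gained fraction can exceed 1 for a variant that acquires many new sites.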

PBM binding profiles within an allelic series 
differed for variants associated with distinct dis- 
ease phenotypes (fig. S7), supporting results from 
a yeast one-hybrid screen of Mendelian disease 
TF mutants (15). They also provided insights
into the molecular basis of clinical heterogeneity
of disease mutations affecting the same
genes. For example, CRX is associated with Mendelian
diseases of retinal degeneration (16). The
R90W allele, associated with the severe disease
Leber congenital amaurosis 7 (17), lost the ability
to bind most 8-mers bound by wild-type CRX. In
contrast, the R41W allele, associated with cone-rod
dystrophy 2 (18), resulted in a moderate specificity
change (fig. S7B).

The 8-mer binding profiles of HOXD13 alleles 
displayed a range of effects; several of these al- 
leles are associated with various limb malformations 
(19) (Fig. 3A). The I297V and N298S variants, predicted
to be benign, did not alter DNA binding
activity. The Q325K and Q325R alleles gained
recognition of novel motifs, consistent with those
learned from chromatin immunoprecipitation with
high-throughput sequencing (ChIP-Seq) data (12).
Allele-preferred 8-mers (Fig. 3B and fig. S8A) are 
enriched within ChIP-Seq peaks bound exclusively 
by the respective allele (Fig. 3C and figs. S8B and 
S9) (P < 0.01, Wilcoxon signed-rank test). Putative 
target genes, associated with ChIP-Seq peaks enriched
(P < 2.2 × 10^-16, one-tailed Wilcoxon signed-rank
test) for Q325K- or Q325R-preferred versus
reference-preferred 8-mers (fig. S10) (6), are over- 
represented among genes up-regulated by the 
corresponding allele (P < 0.01, permutation test) 
(Fig. 3D and figs. S8C and S11), consistent with 
HOXD13 acting as a transcriptional activator 
(20). These results suggest that these variants’ 
changes in binding specificity alter genomic 
occupancy, leading to inappropriate gene expres-
sion through gained binding sites. 

As expected, mutations in residues that either 
contact DNA or neighbor a base-contacting res- 
idue were enriched (odds ratio = 4.3, P = 0.003, 
Fisher’s exact test) among DBDPs with altered 
DNA binding affinity or specificity (Fig. 4A). We 
also found variants at non-DNA-contacting posi- 
tions that altered DNA binding, potentially by 
affecting protein conformation or stability. We 
identified 3833 unique missense variants that are 
predicted to be damaging by both PolyPhen-2 (9) 
and SIFT (10) and occur at DNA-contacting residues
(Fig. 4B). These values are likely an underestimate
of damaging DBDPs across all human
TFs (Fig. 4C). These damaging nsSNPs occur at 
lower frequencies in the ExAC population than 
do nsSNPs for which no change in DNA binding 
is predicted (P < 0.05, permutation test) (Fig. 
4D), suggesting that they are more likely to be 
deleterious. 

Per individual, there were very few (median = 2) 
nonsense DBD variants but a wide range in the 
number of putatively damaging missense variants 
(median = 9, DBDPs at DNA-contacting residues 
and predicted as damaging by PolyPhen-2 and
SIFT) (Fig. 4E and fig. S12). Hence, we inves- 
tigated what mechanisms might allow damaged 
DBDPs to be tolerated. TFs reported to tolerate 
homozygous loss-of-function (LoF) mutations in 
Icelanders (21) had a significantly higher fraction
of DNA-contacting residues altered by our identi- 
fied nsSNPs (P = 6.63 x 10°, permutation test) 
(Fig. 4F). TFs with a coexpressed paralog (22) had 
a significantly higher fraction of variable DNA- 
contacting residues (P = 6.11 × 10^-8, permutation
test) (Fig. 4G); this enrichment was significant 
independent of LoF-tolerance status (P < 0.005, 
t test) (6). Additional compensation could arise 
from epistasis with cis-regulatory variants (23). 
Damaged DBDPs might be associated with un- 
diagnosed or subclinical phenotypes, variably 
penetrant phenotypes due to epistatic or gene- 
environment interactions, or phenotypes that 
present in later life. 

Our results highlight the utility of PBM pro- 
filing to reveal changes in the DNA binding 
activities of variant DBDs. PBM profiling of DBDPs 
identified through additional sequencing studies 
may elucidate disease pathologies by revealing 


alterations in DNA binding that result in tran- 
scriptional dysregulation. 

Our analyses suggest that most unrelated in- 
dividuals have a unique repertoire of TF alleles 
with a distinct landscape of DNA binding activ- 
ities. Variants with subtle changes in DNA bind- 
ing activities may confer reduced deleteriousness 
and thus have greater potential for giving rise to 
phenotypic variation. Analysis of genetic inter- 
actions among TFs, TF variants, and noncoding 
regulatory variation likely will provide insights 
into the structure of genetic variation that leads 
to phenotypic differences among people. 

Oncogenes are activated through well-known chromosomal alterations such as gene fusion,
translocation, and focal amplification. In light of recent evidence that the control of key genes
depends on chromosome structures called insulated neighborhoods, we investigated whether
proto-oncogenes occur within these structures and whether oncogene activation can occur
via disruption of insulated neighborhood boundaries in cancer cells. We mapped insulated
neighborhoods in T cell acute lymphoblastic leukemia (T-ALL) and found that tumor cell genomes
contain recurrent microdeletions that eliminate the boundary sites of insulated neighborhoods
containing prominent T-ALL proto-oncogenes. Perturbation of such boundaries in nonmalignant
cells was sufficient to activate proto-oncogenes. Mutations affecting chromosome
neighborhood boundaries were found in many types of cancer. Thus, oncogene activation can
occur via genetic alterations that disrupt insulated neighborhoods in malignant cells.


Tumor cell gene expression programs are
typically driven by somatic mutations that
alter the coding sequence or expression of
proto-oncogenes (1) (Fig. 1A), and identifying
such mutations in patient genomes is a
major goal of cancer genomics (2, 3). Dysregulation
of proto-oncogenes frequently involves mutations
that bring transcriptional enhancers into
proximity of these genes (4). Transcriptional enhancers
normally interact with their target genes
through the formation of DNA loops (5-7), which
typically are constrained within larger CCCTC-binding
factor (CTCF) cohesin-mediated loops
called insulated neighborhoods (8-10), which
in turn can form clusters that contribute to topologically
associating domains (TADs) (11, 12)
(fig. S1A). This recent understanding of chro-
mosome structure led us to hypothesize that si- 
lent proto-oncogenes located within insulated 
neighborhoods might be activated in cancer cells 
via loss of an insulated neighborhood bound- 
ary, with consequent aberrant activation by en- 
hancers that are normally located outside the 
neighborhood (Fig. 1A, lowest panel). 

To test this hypothesis, we used chromatin 
interaction analysis by paired-end tag sequenc- 
ing (ChIA-PET) to map neighborhoods and other 
cis-regulatory interactions in a cancer cell ge- 
nome (Fig. 1B and table S1). A T cell acute 
lymphoblastic leukemia (T-ALL) Jurkat cell line 
was selected for these studies because key T- 
ALL oncogenes and genetic alterations are well 
known (13, 14). The ChIA-PET technique gener- 


ates a high-resolution (~5 kb) chromatin inter- 
action map of sites in the genome bound by a 
specific protein factor (8, 15, 16). Cohesin was 
selected as the target protein because it is in- 
volved in both CTCF-CTCF interactions and 
enhancer-promoter interactions (5-7) and has 
proven useful for identifying insulated neigh- 
borhoods (8, 10) (fig. S1, A and B). The cohesin 
ChIA-PET data were processed using multiple 
analytical approaches (figs. S1 to S4 and table 
S2), and their analysis identified 9757 high-confidence
interactions, including 9038 CTCF-
CTCF interactions and 379 enhancer-promoter 
interactions (fig. S4C). The CTCF-CTCF loops 
had a median length of 270 kb, contained on 
average two or three genes, and covered ~52% 
of the genome (table S2). Such CTCF-CTCF loops 
have been called insulated neighborhoods be- 
cause disruption of either CTCF boundary causes 
dysregulation of local genes due to inappro- 
priate enhancer-promoter interactions (8, 10). 
Consistent with this, the Jurkat chromosome 
structure data showed that the majority of cohesin- 
associated enhancer-promoter interactions had 
end points that occurred within the CTCF-CTCF 
loops (Fig. 1C and fig. S2H). These results pro- 
vide an initial map of the three-dimensional (3D) 
regulatory landscape of a tumor cell genome. 
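
The containment check behind the statement that most enhancer-promoter interactions fall within CTCF-CTCF loops can be illustrated with a small sketch. It treats each loop and each interaction as a genomic interval defined by its two anchors and counts an interaction as insulated when both end points lie inside a single loop. The `Interval` type, the function names, and all coordinates are hypothetical and are not taken from the ChIA-PET data.

```python
from typing import NamedTuple, Sequence


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def is_within(inner: Interval, outer: Interval) -> bool:
    """True if both end points of `inner` lie inside `outer` on the same chromosome."""
    return (inner.chrom == outer.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


def fraction_within_loops(interactions: Sequence[Interval],
                          loops: Sequence[Interval]) -> float:
    """Fraction of interactions fully contained in at least one loop."""
    if not interactions:
        raise ValueError("no interactions supplied")
    inside = sum(1 for ia in interactions
                 if any(is_within(ia, loop) for loop in loops))
    return inside / len(interactions)


# Hypothetical coordinates, for illustration only.
example_loops = [Interval("chr1", 100_000, 370_000), Interval("chr2", 50_000, 200_000)]
example_interactions = [
    Interval("chr1", 120_000, 300_000),  # inside the chr1 loop
    Interval("chr1", 350_000, 400_000),  # crosses the chr1 loop boundary
    Interval("chr2", 60_000, 90_000),    # inside the chr2 loop
]
example_fraction = fraction_within_loops(example_interactions, example_loops)
```

The linear scan over loops is adequate for a few thousand intervals; an interval tree would be a natural replacement for genome-scale comparisons.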
We next investigated the relationship between 
genes that have been implicated in T-ALL patho- 
genesis and the insulated neighborhoods. The 
majority of genes (40 of 55) implicated in T-ALL 
pathogenesis, as curated from the Cancer Gene 
Census and individual studies (table S3), were 
located within the insulated neighborhoods iden- 
tified in Jurkat cells (Fig. 2A and fig. S5); 27 of 
these genes were transcriptionally active and 
13 were silent, as determined by RNA sequenc- 
ing (RNA-seq) (Fig. 2A and table S4). Active 
oncogenes are often associated with super- 
enhancers (17, 18), and we found that 13 of the 
27 active T-ALL pathogenesis genes were asso- 
ciated with superenhancers (Fig. 2, A and B, 
and fig. S5A). Silent genes have also been shown 
to be protected by insulated neighborhoods from 
active enhancers located outside the neighbor- 
hood, and we found multiple instances of silent 
proto-oncogenes located within CTCF-CTCF loop 
structures in the Jurkat genome (Fig. 2, A and C, 
and fig. S5B). Thus, both active oncogenes and 

Fig. 1. 3D regulatory landscape of the T-ALL genome. (A) Mechanisms
activating proto-oncogenes. (B) Hi-C interaction map, TADs defined in human
embryonic stem cells (H1), cohesin ChIA-PET interactions (intensity of blue
arc represents interaction significance), CTCF and H3K27Ac chromatin
immunoprecipitation sequencing (ChIP-seq) profiles and peaks, and RNA-seq
in Jurkat cells at the CD3D locus. ChIP-seq peaks are denoted as bars above
ChIP-seq profiles. (C) ChIA-PET interactions at the RUNX1 locus displayed
above the ChIP-seq profiles of CTCF, cohesin (SMC1), and H3K27Ac. FDR,
false discovery rate.

Fig. 2. Active oncogenes and silent proto-oncogenes occur in insulated neighborhoods. (A) T-ALL
pathogenesis genes. Colored boxes indicate whether a gene is located within a neighborhood, expressed,
and associated with a superenhancer. (B) Insulated neighborhood at the active TAL1 locus. The cohesin
ChIA-PET interactions are displayed above the ChIP-seq profiles of CTCF, cohesin (SMC1), H3K27Ac, and
RNA-seq profile. A model of the insulated neighborhood is shown on the right. (C) Insulated neighborhood
at the silent LMO2 locus.

If some insulated neighborhoods function to
prevent proto-oncogene activation, some T-ALL
tumor cells may have genetic alterations that
containing T-ALL oncogenes. To investigate this
possibility, we identified recurrent deletions in
T-ALL genomes that span insulated neighbor-
cancer (19, 20). TAL1 can be activated by deletions
that fuse a promoterless TAL1 gene to the
promoter of STIL (19), and this was observed in
many patient deletions (Fig. 3A). Several patient
deletions, however, retained the TAL1 promoter
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Fig. 3. Disruption of insulated 
neighborhood boundaries 

is linked to proto-oncogene 
activation. (A) Cohesin 
ChIA-PET interactions and 
CTCF and cohesin (SMC1) 
binding profiles at the TALI 
locus in Jurkat cells. Patient 
deletions described in (22) are 
shown as bars below the gene 
models. The deletion on the 
bottom indicates the minimally 
deleted region identified in 
(26). (B) ChIP-seq profiles of 
CTCF, H3K27Ac, p300, CBP, 
and RNA-seq at the TALI locus 
in HEK-293T cells. The region 
deleted using a CRISPR/Cas9- 
based approach is highlighted 
in a gray box. (C) Quantitative 
reverse transcription polymer- 
ase chain reaction (qRT-PCR) 
analysis of TAL1 expression in 
wild-type HEK-293T cells (wt) 
and in cells where the neigh- 
borhood boundary highlighted 
in (B) was deleted. (D) Model 
of the neighborhood and per- 
turbation at the TAL1 locus. 
(E) 5C contact matrices in 
wild-type HEK-293T cells and 
TAL1 neighborhood boundary-deleted
cells. An arrow indi-
cates the position of the region
removed in the mutant cells.
(F) Distance-adjusted z-score
difference (5C) maps at the
TAL1 locus (ΔCTCF − wild-type
HEK-293T). Note the increase
in the 5C signal adjacent to the 
deleted region. CTCF and 
H3K27Ac binding profiles in 
wild-type cells are displayed for 
orientation. (G) Cohesin ChIA- 
PET interactions and CTCF and 
cohesin (SMC1) binding pro- 
files at the LMO2 locus. Patient 
deletions described in (22) are 
shown as bars below the gene 
models. (H) ChIP-Seq binding 
profile of CTCF, H3K27Ac, 
p300, CBP, and RNA-seq at 
the LMO2 locus in HEK-293T 
cells. The region deleted by a 
CRISPR/Cas9-based approach 
is highlighted in a gray box. 

(I) qRT-PCR analysis of LMO2 
expression in wild-type HEK- 
293T cells and in cells where 
the neighborhood boundary 
highlighted in (H) was deleted. 
(J) Model of the neighborhood
and perturbation at the LMO2
locus. (K) 5C contact matrices in wild-type HEK-293T cells and LMO2 neighborhood boundary-deleted cells. An arrow indicates the position of the region
removed in the mutant cells. (L) Distance-adjusted z-score difference (5C)
maps at the LMO2 locus (ΔCTCF − wild-type HEK-293T). Note the increase in the
5C signal adjacent to the deleted region. CTCF and H3K27Ac binding profiles
in wild-type cells are displayed for orientation. In (C) and (I), data from n = 3
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independent biological replicates are displayed as means ± SD; P < 0.01 between wild-type and boundary-deleted cells (two-tailed t test).
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(end point >5 kb from promoter) but overlapped 
the CTCF boundary site of the TAL1 neighbor-
hood (Fig. 3A), and TAL1 was active in the sam-
ples harboring these deletions (fig. S7, A and B).
This suggests disruption of the insulated neigh-
borhood, allowing activation of TAL1 by regu-
latory elements outside of the loop.

We tested this idea by CRISPR/Cas9-mediated
deletion of the TAL1 neighborhood boundary
in human embryonic kidney (HEK-293T) cells
(Fig. 3B). In these cells the TAL1 proto-oncogene
is silent, as evidenced by low H3K27Ac (histone
H3 acetylated at Lys27) occupancy and low RNA-seq
signal (Fig. 3B). However, at least one active regulatory
element occurs ~60 kb upstream of TAL1, adja-
cent to the CMPK1 promoter, as evidenced by
high levels of H3K27Ac and p300/CBP (Fig. 3B)
and enhancer reporter assays (fig. S8, A and B).
Deletion of a ~400-base pair (bp) segment en-
compassing the boundary CTCF site, which abol-
ished CTCF binding (fig. S8A), caused a factor of
2.3 induction of the TAL1 transcript (Fig. 3C),
which suggests that the integrity of the neigh-
borhood contributes to the silent state of TAL1
(Fig. 3D). In support of this model, contacts be-
tween DNA regions that are normally within
and outside of the neighborhood were increased
(Fig. 3, E and F, and fig. S10). Furthermore, de-
letion of the CTCF site in primary human T cells
also caused a small but detectable activation of
TAL1 (fig. S8, C to G). These results are consistent
with the idea that the silent state of the TAL1
proto-oncogene is dependent on the integrity of
the insulated neighborhood (Fig. 3D).

We further tested the model that site-specific 
perturbation of a loop boundary is sufficient to 
activate a proto-oncogene at the LMO2 locus. 
The LMO2 gene encodes a transcription factor 
that is overexpressed and oncogenic in some 
forms of T-ALL (14, 20). The region upstream 
of the LMO2 promoter is recurrently deleted in 
T-ALL, and these deletions are linked to LMO2 
activation (Fig. 3G); a previous study proposed 
that deletion of cryptic repressors located in the 
deleted region enables activation of LMO2 (21). 
Analysis of a T-ALL patient cohort (22) revealed 
deletions that overlap the CTCF boundary site 
of the LMO2 neighborhood, and patient cells 
harboring these deletions had generally high 
levels of LMO2 expression (fig. S9, A and B). 
CRISPR/Cas9-mediated deletion in HEK-293T 
cells of a ~25-kb segment encompassing the in- 
sulated neighborhood boundary CTCF site and 
two additional CTCF sites that could act as bound- 
ary elements caused a factor of 2 increase in the 
LMO2 transcript (Fig. 3, H to J) and a large-scale 
rearrangement of interactions around LMO2, as 
evidenced by chromosome conformation capture 
carbon copy (5C) analysis (Fig. 3, K and L, and fig. 
S10). These results indicate that the deleted CTCF 
sites contribute to the silent state of the LMO2 
proto-oncogene (Fig. 3J). 

The boundaries of chromosome neighborhoods 
may be disrupted in other cancers. A recent 
study noted that mutations in CTCF binding 
sites occur frequently in cancers (23), but it is 
unclear whether mutations in boundaries are 
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common, as only a subset of CTCF sites form in- 
sulated neighborhoods (8, 10, 24). CTCF cohesin- 
bound loops are largely preserved across cell 
types (8, 9, 24), and a set of ~10,000 constitutive 
CTCF-CTCF loops shared by GM12878 lympho- 
blastoid, Jurkat, and K562 (CML) cells (24) were 
identified for comparison (Fig. 4A, fig. S11, and 
table S8). We used the International Cancer Gen- 
ome Consortium (ICGC) database—which contains 
data for ~50 cancer types, ~2300 whole-genome 
sequence (WGS) samples, and ~13 million unique 
somatic mutations—to examine the boundaries 
of these neighborhoods for somatic point muta- 
tions found in cancer genomes (table S9). We 
found a striking enrichment of mutations at the 
CTCF boundaries of constitutive neighborhoods 
(Fig. 4B, fig. S12A, and table S10) relative to re-
gions flanking the boundary CTCF sites (±1 kb
of the CTCF binding motif: P < 10⁻⁴, permutation
test) (fig. S12B), and in many instances these
created a change in the consensus CTCF binding
motif (fig. S12C). Nonboundary CTCF sites did
not show such enrichment (Fig. 4B and figs. S12D 
and S14). The genomes of esophageal and liver 
carcinoma samples were particularly enriched 
for boundary CTCF site mutations (Fig. 4, C and 
D, fig. S12, D and E, fig. S13, and table S10), and 
there was no similar enrichment of mutations at 
the binding sites of other transcription factors 
(fig. S15). In these cancers, a considerable frac- 
tion of the mutated neighborhood boundary 
CTCF sites were affected by multiple mutations 
(≥3 mutations per site) [280/1826 (15%) in esoph-
ageal carcinoma, 54/1030 (5%) in liver carcinoma]
(table S10), and recurrent mutations occurred 
more frequently in neighborhood boundary CTCF 
sites relative to nonboundary CTCF sites (fig. S16, 
A to C). The genes located within the most fre- 
quently mutated neighborhoods included known 
cellular proto-oncogenes annotated in the Cancer 
Gene Census and other genes that have not been 
associated with these cancers (Fig. 4, E and F, 
and tables S11 and S12). Shown in Fig. 4, G and H,
are two examples of proto-oncogene-containing 
neighborhoods where the activation of the gene 
located in the neighborhood has been observed 
in the respective cancer type. These results sug- 
gest that somatic mutations of insulated neigh- 
borhood boundaries occur in the genomes of 
many different cancers. 
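
The enrichment comparison described above (boundary CTCF sites versus flanking regions, assessed by permutation test) can be illustrated with a minimal sketch. The function name, the toy mutation counts, and the label-shuffling scheme below are assumptions made for illustration only; the procedure actually used in the study is described in its supplementary methods and may differ.

```python
import random

def permutation_pvalue(site_counts, flank_counts, n_perm=10000, seed=0):
    """One-sided label-shuffling permutation test (illustrative sketch).

    Asks whether the mean mutation count in boundary CTCF sites exceeds
    the mean count in flanking windows more than expected by chance.
    """
    rng = random.Random(seed)
    n_site = len(site_counts)
    n_flank = len(flank_counts)
    observed = sum(site_counts) / n_site - sum(flank_counts) / n_flank
    pooled = list(site_counts) + list(flank_counts)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the pooled counts, then re-split into two groups
        rng.shuffle(pooled)
        diff = sum(pooled[:n_site]) / n_site - sum(pooled[n_site:]) / n_flank
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy counts, not data from the study
boundary = [3, 2, 4, 3, 5, 2, 3, 4]
flanking = [0, 1, 0, 0, 1, 0, 1, 0]
print(permutation_pvalue(boundary, flanking))
```

With 10,000 permutations the smallest P value this sketch can report is about 10⁻⁴, so the number of permutations sets the resolution of the test.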

Our findings indicate that disruption of in- 
sulated neighborhood boundaries can cause on- 
cogene activation in cancer cells. With maps of 
3D chromosome structure such as those described 
here, cancer genome analysis can consider how 
recurrent perturbations of boundary elements 
may affect the expression of genes with roles in 
tumor biology. Our understanding of 3D chro- 
mosome structure and its control is rapidly ad- 
vancing and should be considered for potential 
diagnostic and therapeutic purposes. Because con- 
trol of 3D chromosome structure involves bind- 
ing of specific sites by CTCF and cohesin, which 
is affected by protein cofactors, DNA methyla- 
tion, and local RNA synthesis (25), advances in 
our understanding of these regulatory processes 
may provide new approaches to therapeutics 
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that have an impact on aberrant chromosome 
structures. 


REFERENCES AND NOTES 


1. B. Vogelstein, K. W. Kinzler, Nat. Med. 10, 789-799 (2004).
2. B. Vogelstein et al., Science 339, 1546-1558 (2013).
3. L. A. Garraway, E. S. Lander, Cell 153, 17-37 (2013).
4. C. M. Croce, N. Engl. J. Med. 358, 502-511 (2008).
5. M. H. Kagey et al., Nature 467, 430-435 (2010).
6. J. H. Gibcus, J. Dekker, Mol. Cell 49, 773-782 (2013).
7. D. U. Gorkin, D. Leung, B. Ren, Cell Stem Cell 14, 762-775 (2014).
8. J. M. Dowen et al., Cell 159, 374-387 (2014).
9. J. E. Phillips-Cremins et al., Cell 153, 1281-1295 (2013).
10. X. Ji et al., Cell Stem Cell 18, 262-275 (2016).
11. J. R. Dixon et al., Nature 485, 376-380 (2012).
12. E. P. Nora et al., Nature 485, 381-385 (2012).
13. S. A. Armstrong, A. T. Look, J. Clin. Oncol. 23, 6306-6315 (2005).
14. P. Van Vlierberghe, A. Ferrando, J. Clin. Invest. 122, 3398-3406 (2012).
15. M. J. Fullwood et al., Nature 462, 58-64 (2009).
16. Z. Tang et al., Cell 163, 1611-1627 (2015).
17. D. Hnisz et al., Cell 155, 934-947 (2013).
18. J. Lovén et al., Cell 153, 320-334 (2013).
19. L. Brown et al., EMBO J. 9, 3343-3351 (1990).
20. J. O'Neil, A. T. Look, Oncogene 26, 6838-6849 (2007).
21. P. Van Vlierberghe et al., Blood 108, 3520-3529 (2006).
22. J. Zhang et al., Nature 481, 157-163 (2012).
23. R. Katainen et al., Nat. Genet. 47, 818-821 (2015).
24. N. Heidari et al., Genome Res. 24, 1905-1917 (2014).
25. C. T. Ong, V. G. Corces, Nat. Rev. Genet. 15, 234-246 (2014).
26. C. G. Mullighan et al., Nature 446, 758-764 (2007).


ACKNOWLEDGMENTS 


Supported by NIH grants HG002668 (R.A.Y.), CA109901 (R.A.Y.),
NS088538 (R.J.), MH104610 (R.J.), and AI120766 (M.H.P.); an Erwin
Schrödinger Fellowship (J3490) from the Austrian Science Fund
(FWF) (D.H.); Ludwig Graduate Fellowship funds (A.S.W.); the Laurie
Kraus Lacob Faculty Scholar Award in Pediatric Translational
Research (M.H.P.); Hyundai Hope on Wheels (M.H.P.); and Danish
Council for Independent Research, Medical Sciences, individual
postdoctoral grant DFF-1333-00106B and Sapere Aude Research
Talent grant DFF-1331-00735B (R.O.B.). Work in the Dekker lab is
supported by the National Human Genome Research Institute (R01
HG003143, U54 HG007010, U01 HG007910), the National Cancer
Institute (U54 CA193419), the NIH Common Fund (U54 DK107980,
U01 DA 040588), the National Institute of General Medical
Sciences (R01 GM 112720), and the National Institute of Allergy
and Infectious Diseases (U01 R01 AI 117839). J.D. is an investigator
of the Howard Hughes Medical Institute. We thank R. Fitzgerald,
S. Grimmond, and the ICGC Genome Projects ESAD-UK and
OV-AU for permission to use genome sequence data. Data sets
generated in this study have been deposited in the Gene
Expression Omnibus under accession number GSE68978. The
Whitehead Institute filed a patent application based on this paper.
R.A.Y. is a founder of Syros Pharmaceuticals, and R.J. is a founder
of Fate Therapeutics.


SUPPLEMENTARY MATERIALS 


www.sciencemag.org/content/351/6280/1454/suppl/DC1 
Materials and Methods 

Figs. S1 to S: 
Tables S1 to S13 
References (27-71) 


19 November 2015; accepted 18 February 2016 
Published online 3 March 2016 
10.1126/science.aad9024 




HIV-1 VACCINES 


HIV-1 broadly neutralizing antibody 
precursor B cells revealed by 
germline-targeting immunogen 


Joseph G. Jardine, Daniel W. Kulp, Colin Havenar-Daughton,
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Induction of broadly neutralizing antibodies (bnAbs) is a major HIV vaccine goal. 
Germline-targeting immunogens aim to initiate bnAb induction by activating bnAb 
germline precursor B cells. Critical unmet challenges are to determine whether bnAb 
precursor naive B cells bind germline-targeting immunogens and occur at sufficient 
frequency in humans for reliable vaccine responses. Using deep mutational scanning 
and multitarget optimization, we developed a germline-targeting immunogen (eOD-GT8) 
for diverse VRC01-class bnAbs. We then used the immunogen to isolate VRC01-class
precursor naive B cells from HIV-uninfected donors. Frequencies of true VRC01-class
precursors, their structures, and their eOD-GT8 affinities support this immunogen as a 
candidate human vaccine prime. These methods could be applied to germline targeting 
for other classes of HIV bnAbs and for Abs to other pathogens. 


Development of an HIV vaccine is a global
health priority. Recent discoveries of po-
tent broadly neutralizing antibodies (bnAbs)
that bind to relatively conserved epitopes
on the HIV Env glycoprotein trimer and
protect against challenge in animal models have
reinvigorated vaccine design efforts to induce
bnAbs (1). However, bnAbs have not been elicited
in standard animal models or humans.
Germline targeting, a vaccine priming strat- 
egy to initiate the affinity maturation of select 
germline-precursor B cells, has promise to initiate 
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bnAb induction. The goals for a germline-targeting 
prime are to activate B cell precursors with bnAb 
potential, select productive (bnAb-like) somatic 
mutations, and generate an expanded popula- 
tion of memory B cells that can be boosted and 
matured subsequently to shepherd the response 
further toward bnAb development (2, 3). For a 
few HIV bnAbs, next-generation sequencing of 
antibody populations during bnAb development in 
infected individuals has allowed bioinformatic 
inference of likely human germline precursors 
(4, 5). For most bnAbs, however, true human pre- 
cursors are not known but are usually approxi- 
mated by “germline-reverted” antibodies that use 
inferred germline V and J genes and retain ma- 
ture CDR3 (complementarity-determining region 
3) loops. Because CDR3 loops typically play a ma- 
jor role in antibody affinity and specificity, germline- 
reverted bnAbs are not known to be reliable 
proxies for true germline precursors. 
VRC01-class bnAbs are an important test case
for germline targeting, because they are among
the most broad and potent of HIV bnAbs and be-
cause their germline-reverted forms show no de-
tectable affinity for HIV Env glycoproteins (6-10).
Knock-in mice transgenic for a germline-reverted
VRC01-class heavy chain responded to immuni-
zation with the germline-targeting eOD-GT8 60-
subunit self-assembling nanoparticle (60mer) but
not with native-like Env trimers, providing proof
of principle that germline-targeting immunogens
can initiate a VRC01-class response if well-matched
B cells are present and competing B cells are 
strongly reduced in frequency (2, 3). Here, we 
address further critical knowledge gaps that 
obstruct the development of this (or any) germline- 
targeting immunogen as a human vaccine: Do 
the targeted bnAb precursors exist in humans? 
What is the frequency and person-to-person varia- 
tion of germline-targeting immunogen-specific 
bnAb precursors? Can the germline-targeting im- 
munogen bind the targeted human bnAb pre- 
cursors in competition with other B cells in the 
fully complex human B cell repertoire? We exam- 
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ined these questions by developing new ex vivo 
approaches and protein design methods. 

When we used the VRC01-class germline-
targeting immunogen eOD-GT6 (9) as bait to
screen human naive B cells via a two-phase
multiple-validation methodology (11) (fig. S1),
we failed to isolate VRC01-class B cells. We did,
however, isolate non-VRC01-class naive B cells
with Ab affinities as low as 120 nM for eOD-GT6
(fig. S1). We therefore set out to develop an im-
proved variant of eOD-GT6 with higher affinity
and breadth for germline-reverted VRC01-class
Abs, hypothesizing that such improvements might
translate into improved affinity for diverse true
VRC01-class precursor Abs.

To improve on eOD-GT6, we used yeast display
library screening coupled with next-generation
sequencing (12). We screened a library of every
point mutation at the 58 eOD:Ab interface posi-
tions on eOD-GT7, a slightly improved version of
eOD-GT6 (11), against each of 29 VRC01-class Abs
(18 germline-reverted and 11 mature bnAbs). By
measuring binding enrichments for each muta-
tion and antibody (Fig. 1A and fig. S2), we iden-
tified 12 positions in eOD-GT7 at which one or
more mutations were favorable (enriched by at
least a factor of 2) for binding to the majority (at
least 10 of 18) of germline-reverted bnAbs, and
another four positions at which one or more mu-
tations were enriched by at least a factor of 1.25
for binding to the vast majority (at least 17 of 18)
of germline-reverted bnAbs (Fig. 1B). To identify
combinations of mutations predicted to confer
the greatest binding cross-reactivity, we then
created a library encompassing all combinations
of a filtered set of the favorable mutations at
those 16 positions (13) (Fig. 1C). Upon screening
this combinatorial library against the panel of
29 VRC01-class Abs, we identified a sequence,
eOD-GT8, predicted to have optimal breadth
against the entire panel (Fig. 1C, figs. S3 and S4,
and table S1).
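
The two selection rules in the preceding paragraph (enrichment by at least a factor of 2 for at least 10 of 18 germline-reverted Abs, or by at least a factor of 1.25 for at least 17 of 18) can be sketched as a simple filter. The data layout, the function name `favorable_positions`, and the toy enrichment values are hypothetical and serve only to make the thresholds concrete; they are not the analysis pipeline used in the study.

```python
def favorable_positions(enrichment, min_fold, min_antibodies):
    """Return sorted positions with at least one qualifying mutation.

    enrichment: dict mapping (position, mutation) to a list of fold
    enrichments, one value per germline-reverted antibody.
    A mutation qualifies if its enrichment is >= min_fold for at least
    min_antibodies antibodies.
    """
    selected = set()
    for (position, _mutation), folds in enrichment.items():
        n_favorable = sum(1 for fold in folds if fold >= min_fold)
        if n_favorable >= min_antibodies:
            selected.add(position)
    return sorted(selected)

# Hypothetical toy data for 18 antibodies (not values from the study)
toy = {
    (10, "A"): [2.5] * 12 + [0.5] * 6,   # strong for 12 of 18
    (10, "G"): [0.1] * 18,               # depleted everywhere
    (22, "K"): [1.3] * 17 + [1.0],       # modest for 17 of 18
    (30, "L"): [1.9] * 18,               # modest for all 18
}
strong = favorable_positions(toy, min_fold=2.0, min_antibodies=10)
broad = favorable_positions(toy, min_fold=1.25, min_antibodies=17)
print(strong, broad)
```

In this sketch the two rules are applied independently; the union of the two position sets would then define the combinatorial library.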

Relative to eOD-GT6, eOD-GT8 demonstrated
superior affinity and breadth of binding to
germline-reverted Abs (Fig. 1D and table S2).
eOD-GT8 bound to all germline-reverted Abs in
the panel, whereas eOD-GT6 bound to only 8 of
14 Abs with dissociation constants (KD) of less
than 100 μM. For those eight germline-reverted
Abs, the geometric mean affinity of eOD-GT8
was higher than that of eOD-GT6 by a factor of
2100; eOD-GT8 also had improved affinity (fac-
tor of 3) for VRC01-class bnAbs. The tightest eOD-
GT8 binding detected was for germline-reverted
PGV20, with a KD of 508 fM (95% confidence
interval, 234 to 943 fM) (Fig. 1D and fig. S5), a
factor of 5900 improvement over eOD-GT6 (KD =
3 nM) and a factor of 33 million improvement
over the original eOD construct, eOD Base [KD =
17 μM (9)], a remarkable affinity improvement
for a protein-protein interface.
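
The fold improvements quoted above follow directly from ratios of the reported dissociation constants. The short sketch below reproduces that arithmetic and shows how a geometric mean of KD values is computed; the helper names are illustrative and are not taken from the study.

```python
import math

# Molar unit prefixes
FM, NM, UM = 1e-15, 1e-9, 1e-6

def fold_improvement(kd_old, kd_new):
    """Ratio of dissociation constants; larger means tighter new binding."""
    return kd_old / kd_new

def geometric_mean(values):
    """Geometric mean, the usual average for KD values spanning decades."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

kd_gt8_pgv20 = 508 * FM                                  # eOD-GT8, germline-reverted PGV20
fold_vs_gt6 = fold_improvement(3 * NM, kd_gt8_pgv20)     # eOD-GT6, KD = 3 nM
fold_vs_base = fold_improvement(17 * UM, kd_gt8_pgv20)   # eOD Base, KD = 17 uM

print(f"{fold_vs_gt6:.0f}")    # on the order of 5900
print(f"{fold_vs_base:.2e}")   # on the order of 33 million
```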

To examine whether VRC01-class precursors
targeted by eOD-GT8 exist in humans, we per-
formed epitope-specific B cell sorting from a pool
of peripheral blood mononuclear cells from healthy,
HIV-seronegative donors. Epitope-specific B cells
bound tetramers of eOD-GT8 but not tetramers
of eOD-GT8-KO, a variant of eOD-GT8 with mu-
tations abrogating binding by VRC01-class germline-
reverted Abs. After sequencing immunoglobulin
(Ig) genes from single sorted cells, we searched
for VRC01-class antibody sequences, that is,
those with a heavy chain that used VH1-2 alleles
*02, *03, or *04 and a light chain with a 5-amino
acid CDR3 (9, 14). After sorting 2.4 million IgM+/
IgG−/CD19+ B cells pooled from nine donors, we
recovered a single GT8+/GT8-KO− Ab that qual-
ified as a VRC01-class precursor. This Ab, VRC01c-
HuGL1, bound to eOD-GT8 with a KD of 22 μM
and had no detectable affinity for eOD-GT8-KO
(fig. S6).

To assess both the percentage of people who
possess VRC01-class germline precursor B cells
and the frequency of VRC01-class germline pre-
cursor B cells within a given donor, we screened
naive B cells from 15 healthy, HIV-seronegative
donors individually rather than pooled. For 7
of 15 samples, we used the two-phase multiple-
validation methodology that first assesses spec-
ificity by probe binding in flow cytometry and
then confirms specificity and lack of polyreactiv-
ity by single-cell secreted IgM (fig. S7); for eight
subsequent donors, we relied on sorting speci-
ficity alone. For optimal cell sorting sensitivity,
B cells were required to simultaneously bind two
eOD-GT8 probes multimerized differently [trimer
(“tri”) and streptavidin tetramers (“SA”)] while
not binding eOD-GT8-KO-SA (Fig. 2A and fig. S7).
For the 15 donors, the mean frequency of eOD-
GT8(tri+/SA+) B cells among 61.6 million naive B
cells sorted was 0.0056% (Fig. 2B). Strikingly, a vast
majority (84 ± 14%) of these eOD-GT8(tri+/SA+) B cells
did not bind eOD-GT8-KO-SA (Fig. 2C), which sug-
gests that naive B cell reactivity to eOD-GT8 is highly
focused to the CD4 binding site (CD4bs) (15).

Paired heavy and kappa light chain sequences
were recovered from 173 eOD-GT8(tri+/SA+)/eOD-
GT8-KO− B cells. All sequences were essentially
germline, confirming the naive B cell sorts. Half
(50%) of these B cells were VH1-2, whereas only
4% of control B cells from reference (16) were
VH1-2 (χ² = 29.9, P < 0.0001; Fig. 2D and fig. S8).
Among these 87 VH1-2+ B cells, 26 had a light
chain CDR3 (L-CDR3) length of 5 amino acids, a
factor of 85 enrichment relative to control B cells
(χ² = 32.6, P < 0.0001; Fig. 2E). Twenty-five of the
26 used the VH1-2*02 allele and one used VH1-
2*04 (table S3); thus, 15% (26/173) of eOD-GT8(tri+/SA+)/
eOD-GT8-KO− B cells were VRC01-class. In total,
we identified 27 independent VRC01-class naive
B cells, including VRC01c-HuGL1.

In addition to the VH1-2 alleles and critical 5-
amino acid L-CDR3, VRC01-class bnAbs possess
several additional defining features, including a
consensus L-CDR3 of Gln-Gln-Tyr-Glu-Phe (QQYEF).
The majority of VRC01-class precursors we iso-
lated contained a QQYxx partial VRC01-class
consensus motif that was significantly enriched
relative to control B cells (67% versus 11%; χ² =
8.2, P < 0.0001; Fig. 2F). Furthermore, 11% con-
tained a QQYEx L-CDR3 motif (versus 1.5% of
control B cells), one mutation away from a per-
fect mature VRC01-class L-CDR3 (Fig. 2F). In ad-
dition, the L-CDR1 loop is under strong selective
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pressure during VRC01-class bnAb affinity matu-
ration to minimize clashes with gp120 (6, 17). VRC01-
class bnAb L-CDR1 loops generally become very
short (2 to 6 amino acids) through deletion, or
retain a germline length of 6 amino acids and
add flexible glycines (17). Of the 27 VRC01-class
precursors isolated by eOD-GT8, 23 used Vκ
genes containing L-CDR1 loops of 6 or 7 amino
acids (Fig. 2G), thus confirming potential to de-
velop into VRC01-class bnAbs. Indeed, 17 of the
VRC01-class naive B cells had Vκ genes used in
known VRC01-class bnAbs (Fig. 2H). At least 24
of the VRC01-class precursors had H-CDR3 lengths
of 10 to 19 amino acids (Fig. 2I) (18), consistent
with known VRC01-class bnAb lengths of 10 to 19
amino acids. Thus, not only are the eOD-GT8-isolated
naive B cells highly enriched for VRC01-
class core characteristics of VH1-02 and a 5-amino
acid L-CDR3, they possess further refined se-
quence attributes of VRC01-class bnAbs.
Combining data from the 15 donors analyzed
individually, the overall frequency of recovered
VRC01-class precursors was 1 in 2.4 million naive
B cells (Fig. 2J), consistent with both our first
pooled sort and a previous bioinformatically es-
timated range (17). The observed counts were con-
sistent with a Poisson distribution with constant
frequency of 1 in 2.4 million (Fig. 2K) (17), which
suggests that VRC01-class precursors occur at a
consistent rate among 96% of humans possess-
ing the necessary VH1-2 alleles (9). Adults have
an estimated 10^10 to 10^11 B cells, and lymph nodes
each have ~50 million B cells, of which ~65 to
75% are naive B cells (19). Thus, our results indi-
cate that VRC01-class precursor B cells are rela-
tively common in humans: At least 2700 to 31,000
eOD-GT8-reactive VRC01-class naive B cells are
likely present in nearly all potential human vac-
cine recipients, with ~15 such B cells in each lymph
node, at any given time (20).
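
The estimates in the preceding paragraph can be reproduced from the numbers given in the text: 26 VRC01-class precursors among 61.6 million naive B cells sorted from individual donors, 10^10 to 10^11 B cells per adult, ~50 million B cells per lymph node, and a naive fraction of 65 to 75%. The sketch below is a back-of-envelope check under those stated figures; the function names are illustrative, and the Poisson helper assumes a constant precursor frequency, as the text proposes.

```python
import math

# Frequency implied by the individually analyzed donors:
# 26 precursors among 61.6 million naive B cells, about 1 in 2.4 million
cells_per_precursor = 61.6e6 / 26
FREQUENCY = 1 / 2.4e6

def expected_precursors(total_b_cells, naive_fraction, frequency=FREQUENCY):
    """Expected number of precursors in a B cell pool of a given size."""
    return total_b_cells * naive_fraction * frequency

def poisson_pmf(k, lam):
    """Probability of observing exactly k precursors when lam are expected."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

low = expected_precursors(1e10, 0.65)       # lower bound for a whole adult
high = expected_precursors(1e11, 0.75)      # upper bound for a whole adult
per_node = expected_precursors(50e6, 0.70)  # one lymph node, mid-range naive fraction

# Chance of recovering no precursor when sorting 2.4 million naive B cells
p_zero = poisson_pmf(0, 2.4e6 * FREQUENCY)

print(round(cells_per_precursor), round(low), round(high), round(per_node, 1), round(p_zero, 3))
```

Under a Poisson model, a sort of 2.4 million cells is expected to miss the precursor population entirely more than a third of the time, which is one reason large per-donor cell numbers matter for this kind of screen.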

The KD values of 24 isolated VRC01-class pre-
cursors for monovalent eOD-GT8 ranged from
57 μM to 125 nM, with a geometric mean KD
of 3.4 μM (Fig. 2L and table S4), weaker than
germline-reverted VRC01-class Abs by a factor of
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cross. (C) Sequence logos 
depicting amino acids at each of 
16 positions in the combinatorial 
library (top), the sequences 
selected from the combinatorial 
library for improved binding to germline-reverted VRC01-class bnAbs
(middle), and the final sequence of eOD-GT8 (bottom). Abbreviations: A, Ala;
C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn;
P, Pro; Q, Gln; R, Arg; T, Thr; V, Val; W, Trp; Y, Tyr. (D) Surface plasmon res-
onance (SPR) dissociation constants measured for both germline-reverted
and mature VRC01-class bnAbs against eOD-GT6 and eOD-GT8. Solid blue
lines show geometric mean measured over all the data, using the value KD =
100 μM for samples with KD > 100 μM; dashed blue lines show geometric
means computed for the eight germline-reverted Abs or 12 bnAbs for
which KDs < 100 μM could be measured for both eOD-GT6 and eOD-GT8.
The lowermost dotted line signifies the limit of detection for our SPR instru-
ment (16 pM); KDs below this value were measured by KinExA.
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590 (geometric mean KD = 5.8 nM for the panel),
most likely due to the naive CDR3 loops in the
former as opposed to the affinity-matured CDR3
loops in the latter. The VRC01-class naive B cell
affinities are in the range expected to allow a
multivalent eOD-GT8 immunogen, such as eOD-
GT8 60mer (2, 9), to activate B cells and initiate
germinal centers (21, 22). Our data also suggest
that eOD-GT8 has promise to produce VRC01-
class memory even given competition from non-
VRC01-class B cells, as eOD-GT8 exhibited a high
degree of CD4bs immunofocusing (Fig. 2C) and
VRC01-class precursors had an affinity advan-
tage (factor of ≈3) over non-VRC01-class CD4bs
epitope-binding precursors (Fig. 2L). The frequen-
cies and eOD-GT8 affinities of bona fide VRC01-
class precursors isolated here warrant human
immunization studies with eOD-GT8 60mer
nanoparticles.

Only 2 of 20 tested VRC01-class precursors
had detectable affinity for eOD-GT6 (Fig. 2L).
Equilibrium binding KD values were 36 μM and
69 μM, and these Abs had two of the highest af-
finities for eOD-GT8 at 506 nM and 258 nM, re-
spectively (table S4). These data, combined with
the failure of eOD-GT6 probe B cell screens to
isolate VRC01-class precursors, suggest that the
engineered breadth and affinity improvements
in eOD-GT8 represent a major advance toward
practical utility in human vaccination.

We sought to confirm that the isolated VRC01-
class precursors engage the CD4bs in the same
structural binding mode as VRC01-class bnAbs
(6, 17, 23-25) and germline-reverted VRC01 (9).
We solved the crystal structure of isolated pre-
cursor VRC01c-HuGL2 (eOD-GT8 KD = 368 nM)
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Fig. 2. eOD-GT8-binding VRC01-class naive B cells exist in healthy hu-
man donors. (A) eOD-GT8+ naive CD19+IgG− B cells. (B and C) eOD-GT8+ B
cell frequency (B) and eOD-GT8-KO− cells (C) among eOD-GT8+ B cells in
individual donors. (D) VH1-2 usage among eOD-GT8+/eOD-GT8-KO− sorted B
cells (n = 173) versus control B cells. VH1-2 (red) allele frequencies are indicated.
(E) B cells expressing a 5-amino acid L-CDR3 among VH1-2+ B cells isolated by
eOD-GT8 versus control B cells. (F) L-CDR3 sequence logos of VRC01-class
bnAbs (top), VRC01-class naive precursors (middle), and control B cells (bottom).
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(G) L-CDR1 lengths of 27 VRC01-class naive B cells. (H) Light chain V gene usage
of 27 VRC01-class naive B cells. Known VRC01-class bnAb Vκ genes are in red.
(I) H-CDR3 lengths of VRC01-class naive B cells versus control B cells. (J) Total
B cells screened and VRC01-class naive B cells found in 15 individuals. (K) Poisson
distribution modeling of the number of VRC01-class naive B cells. Vertical lines
show the 2.5% and 97.5% quantiles. (L) SPR dissociation constants for eOD-
GT6 or eOD-GT8 binding to VRC01-class or non-VRC01-class Abs derived from
eOD-GT8-sorted human naive B cells. Solid red lines indicate geometric mean.
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in complex with eOD-GT8 in two crystal forms
(I222, 2.16 Å, and C2, 2.44 Å; table S5). Com-
parison of this structure with the complex of
core-gp120 bound to VRC01 [PDB ID: 3NGB (6)]
shows the same binding mode (Fig. 3A), includ-
ing specific H-CDR2 and L-CDR3 conformations
(Fig. 3B) (26) that together account for more
than 67.2% of the Fv domain buried surface area
(Fig. 3C and table S6). When interface residues of
eOD-GT8 and core-gp120 are aligned, VH and VL
of VRC01c-HuGL2 and VRC01 have high similar-
ity (Cα RMSD 0.7 Å; Fig. 3A and fig. S9). These
structural observations confirm VRC01c-HuGL2
as a bona fide VRC01-class precursor and support
the conclusion that all of the eOD-GT8-specific
naive B cells using VH1-2 and a 5-amino acid
L-CDR3 are bona fide VRC01-class precursors.
Comparison of the eOD-GT8-VRC01c-HuGL2
structure with a 1.82 Å unliganded VRC01c-HuGL2
structure shows that the important H-CDR2 and
L-CDR3 loops are preconfigured in the unbound
state and do not require any conformational
changes for engagement with gp120 CD4bs (Fig.
3D), heightening the appeal of VRC01-class germ-
line targeting. A 2.9 Å unliganded structure of
eOD-GT8 (Fig. 3E and fig. S10) demonstrates
faithful mimicry of the VRC01-class antibody-
bound conformation (27), thus helping to explain
the increased affinity of eOD-GT8 for true VRC01-
class bnAb precursors.

The interaction of the naive human B cell 
repertoire with vaccine antigens has not been 
characterized previously. Given the vast immuno- 
globulin sequence space, direct probing of the 
human naive B cell repertoire was a critical test 
of the physiologically relevant binding potential 
of the germline-targeting immunogen. The anti- 
body sequence features, binding affinities, and 
high structural similarity of the eOD-GT8-specific
naive B cell-derived antibodies to VRC01 all demon-
strate the power of germline-targeting design when
combined with human B cell probing. Similar 
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Fig. 3. Structural analysis of eOD-GT8 and human germline antibody 
VRC01c-HuGL2 complex. (A) Crystal structures of VRC01c-HuGL2 + eOD-GT8
(blue, LC; salmon, HC; orange, eOD-GT8) and of mature VRC01 + gp120 (PDB ID:
3NGB, in white) shown in the same orientation, showing eOD-GT8 superimposed
on gp120, and showing only the antibody Fv regions for clarity. (B) Comparison of
the H-CDR2 and L-CDR3 conformations from the structures in (A). (C) Com-
parison of buried surface areas for the VH and VL residues of VRC01c-HuGL2 and
mature VRC01 + gp120, in their bound forms. (D) Comparison of H-CDR2 and
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L-CDR3 conformations of unliganded and eOD-GT8-liganded VRC01c-HuGL2
Fab. All atoms of VH and VL were aligned. In the left image, H-CDR2 and L-CDR3
are shown as sticks; in the right image the CDRs are shown according to B-factors
reporting local structural flexibility using a relative scale in which increasing wire
thickness and warmness of color (blue to red) indicates increasing mobility.
(E) Crystal structure of unliganded eOD-GT8 shown in cartoon representation
(left) and a superposition of unliganded and VRC01c-HuGL2-bound forms
of eOD-GT8 (right; Cα RMSD = 0.4 Å).
methods, including both protein design and human
B cell probing methods, could be used to
improve and evaluate germline-targeting immu- 
nogens for other classes of HIV bnAbs and for 
Abs against other pathogens. These methods may 
be particularly important to develop and test 
germline-targeting approaches for bnAbs that 
rely heavily on HCDR3 and hence may have lower 
precursor frequencies. 

Clonal neoantigens elicit T cell
immunoreactivity and sensitivity to 
immune checkpoint blockade 

As tumors grow, they acquire mutations, some of which create neoantigens that
influence the response of patients to immune checkpoint inhibitors. We explored the 
impact of neoantigen intratumor heterogeneity (ITH) on antitumor immunity. Through 
integrated analysis of ITH and neoantigen burden, we demonstrate a relationship 
between clonal neoantigen burden and overall survival in primary lung adenocarcinomas. 
CD8+ tumor-infiltrating lymphocytes reactive to clonal neoantigens were identified in
early-stage non-small cell lung cancer and expressed high levels of PD-1. Sensitivity
to PD-1 and CTLA-4 blockade in patients with advanced NSCLC and melanoma was
enhanced in tumors enriched for clonal neoantigens. T cells recognizing clonal
neoantigens were detectable in patients with durable clinical benefit. Cytotoxic
chemotherapy-induced subclonal neoantigens, contributing to an increased mutational
load, were enriched in certain poor responders. These data suggest that neoantigen
heterogeneity may influence immune surveillance and support therapeutic
developments targeting clonal neoantigens.


Recent studies have highlighted the relevance
of tumor neoantigens in the recognition
of cancer cells by the immune system
(1-4), prompting a renewed interest in
personalized vaccines and cell therapies
that target cancer mutations (5, 6). However, 
although genomic data are revealing the extent 
of genetic heterogeneity within single tumors (7), 
the influence of intratumor heterogeneity (ITH) 
upon the neoantigen landscape and sensitivity to 
immune modulation is unclear. 


To explore neoantigen heterogeneity and 
its influence on antitumor immunity in early- 
stage non-small cell lung cancer (NSCLC), we 
applied a bioinformatics pipeline to seven pri- 
mary NSCLCs subjected to multiregion se- 
quence analysis (table S1) (8, 9). In total, 2860 
putative neoantigens were predicted across 
the cohort, with a median of 326 neoantigens 
predicted per tumor (range of 80 to 741) (Fig. 1A). 
Neoantigen heterogeneity varied considerably,
with an average of 44% neoantigens found
heterogeneously, in a subset of tumor regions
(range of 10 to 78%).

To address the clinical relevance of neoantigen 
burden and, specifically, the importance of clonal 
(present in all tumor cells) versus subclonal (present 
only in a subset) neoantigens, we subjected a 
predominantly early-stage cohort of 106 stage 
I/II, 43 stage III/IV, and 1 unknown-stage lung 
adenocarcinoma (LUAD) and 92 stage I/II and 
32 stage III/IV lung squamous cell carcinoma 
(LUSC) cases from The Cancer Genome Atlas 
(TCGA) to neoantigen and clonality analysis (10-12) 
(Fig. 1B). In this setting, to determine clonality 
from sequencing of a single sample, the cancer 
cell fraction, which describes the proportion of 
cancer cells harboring a mutation, was determined
for each neoantigen (13).

A high neoantigen burden, defined as the up- 
per quartile of neoantigen load, was associated 
with significantly longer overall survival in LUAD 
(P = 0.025) (Fig. 1, C and D, and fig. S1A), and a 
trend for homogeneous tumors (neoantigen ITH < 
1%) to have longer overall survival times as com- 
pared with that of heterogeneous tumors was also 
observed (P = 0.061) (fig. S1B). Although tumors
with a high burden of neoantigens were found to 
be significantly more homogeneous than those 
with a low burden of neoantigens (P < 0.0001, 
Wilcoxon rank-sum test) (fig. S1C), a combination
of neoantigen ITH and neoantigen burden
(as outlined in the schematic in Fig. 1C) was more 
significant than simply considering either metric 
alone and was observed across multiple different 
neoantigen ITH thresholds (without ITH thresh- 
old, P = 0.025; ITH threshold = 0, P = 0.019; ITH 
threshold = 0.01, P = 0.0096; ITH threshold = 
0.05, P = 0.021) (Fig. 1D), remaining significant in 
multivariate analysis when including the tumor 
stage (table S2).
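
As a rough illustration of the grouping described above, the following Python sketch splits tumors into a "high" group and the rest, with or without a neoantigen ITH threshold. It is not the pipeline used in the study: the function names, the example tumors, and the quartile interpolation method are assumptions made for illustration, and neoantigen ITH is taken here to be the subclonal fraction of predicted neoantigens.

```python
from statistics import quantiles


def upper_quartile(values):
    # 75th percentile; the "inclusive" interpolation method is an assumption.
    return quantiles(values, n=4, method="inclusive")[2]


def high_group(tumors, ith_threshold=None):
    """Return the names of tumors assigned to the 'high' group.

    tumors maps a tumor name to (clonal, subclonal) neoantigen counts.
    Without a threshold, tumors are grouped by the upper quartile of total
    neoantigen burden. With a threshold, a tumor must reach the upper
    quartile of clonal neoantigens and have ITH <= ith_threshold.
    """
    if ith_threshold is None:
        totals = {name: c + s for name, (c, s) in tumors.items()}
        cut = upper_quartile(list(totals.values()))
        return {name for name, total in totals.items() if total >= cut}

    cut = upper_quartile([c for c, _ in tumors.values()])
    selected = set()
    for name, (clonal, subclonal) in tumors.items():
        total = clonal + subclonal
        ith = subclonal / total if total else 0.0  # subclonal fraction
        if clonal >= cut and ith <= ith_threshold:
            selected.add(name)
    return selected


# Hypothetical tumors: (clonal, subclonal) predicted neoantigen counts.
example = {
    "T1": (400, 0),
    "T2": (350, 50),
    "T3": (100, 100),
    "T4": (50, 5),
    "T5": (20, 0),
}

for threshold in (None, 0.0, 0.01, 0.05):
    print(threshold, sorted(high_group(example, threshold)))
```

The two resulting groups would then be compared with a survival test such as the log-rank test reported in Fig. 1D; that step is omitted here because it depends on follow-up data not shown in the text.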

Despite a comparable range of predicted neoantigens
in LUSC, no statistically significant association
between overall survival and neoantigen
load was observed in this subtype, even when 
incorporating neoantigen ITH (fig. S2, A to D). 
To investigate the reason for this disparity, we 
explored whether any immune-regulatory genes 
were differentially expressed between these two 
cancer types. Human lymphocyte antigen (HLA) 
class I genes—including HLA-A, HLA-B, HLA-C, 
HLA-E, HLA-F, and HLA-G, as well as β2-microglobulin
(B2M), a component of the major histocompatibility
complex (MHC) class I molecule—
were expressed at a significantly lower level 
in LUSC as compared with LUAD (fig. S3A and 
table S3A), and this difference was observed across 
all levels of neoantigen burden (fig. S3B and ta- 
ble S3B). HLA class I genes were also down- 
regulated compared with matched normal samples 
in LUSC (table S3C). These data suggest that 
the presence of a high number of clonal neo- 
antigens in homogeneous LUAD may favor ef- 
fective immune surveillance, whereas in LUSC, 
immune escape may be more prevalent through 
HLA down-regulation. 

We next evaluated whether immune-related 
genes were differentially expressed between homo- 
geneous LUAD tumors (<1% neoantigen ITH) with 
a high clonal neoantigen burden (greater than or 
equal to upper-quartile clonal neoantigens) com- 
pared with heterogeneous (>1% neoantigen ITH) 
or low clonal neoantigen burden tumors (less 
than upper-quartile clonal neoantigens). Eight 
genes were found to be significantly differentially 
expressed between these two groups (table S4A). 
Programmed cell death ligand-1 (PD-L1) and the 
proinflammatory cytokine interleukin-6 (IL-6) were 
the most significantly differentially expressed genes, 
up-regulated in the homogeneous and high clonal 
neoantigen group. When we specifically compared 
tumors in the upper quartile of clonal neoantigen 
burden with tumors in the lower quartile, we iden- 
tified an additional 25 significantly differentially 
expressed genes (table S4B and fig. S4A). CD8A, 
CD8B, and genes associated with antigen presentation
(TAP-1, TAP-2, and STAT-1), T cell migration
(CXCL-10 and CXCL-9), and effector T cell function
[interferon-γ (IFN-γ) and granzymes B, H, and A]
were up-regulated in the high clonal neoantigen 
cohort and found to cluster together, indicating 
coexpression (fig. S4B). PD-1 and lymphocyte 
activation gene 3 (LAG-3)—negative regulators of 
T cell function (14)—were also identified in this
cluster, as were the ligands PD-L1 and PD-L2. 

These data suggest that a high clonal neo- 
antigen burden in LUAD is associated with an 
inflamed tumor microenvironment enriched with 
activated effector T cells, potentially regulated 
by inhibitory immune checkpoint molecules and 
their ligands. We therefore attempted to identify 
and characterize T cells reactive to neoantigens 
in patients with early-stage NSCLC. We focused 
on two tumors, L011 and L012, with a comparable
number of predicted neoantigens and a similar
smoking history, but divergent levels of neoantigen
ITH (8 versus 74% heterogeneous predicted neoantigens)
(Fig. 2, A to C). We used 288 and 354
putative neoantigen-loaded, HLA-matched multimers
derived from L011 and L012, respectively,
to screen CD8+ T cells expanded from individual
tumor regions and adjacent normal lung tissue,
using a previously described high-throughput
method (Fig. 2, D and E) (15).

CD8+ T cells reactive to mutant MTFR2(D326Y)
(FAFQEYDSF) were identified in L011, whereas
in L012, two distinct CD8+ T cell responses to mutant
CHTF18(L769V) (LLLDIVAPK) and MYADM(R30W)
(SPMIVGSPW) were observed (Fig. 2, D and E,
and fig. S5, A and B). MTFR2(D326Y), CHTF18(L769V),
and MYADM(R30W) all represent clonal neoantigens,
suggesting that immune reactivity against
clonal neoantigens can be detected in both homogeneous
and heterogeneous NSCLC. High HLA
binding affinity was predicted for MTFR2(D326Y)
and CHTF18(L769V) in both wild-type and mutant
forms, but only the mutant peptide was found to
elicit a T cell response. Higher binding affinity to
mutant versus wild-type form was predicted for
MYADM(R30W); however, in this case, reactivity toward
wild-type peptide was also observed (fig. S5C).
The mutation in the MYADM(R30W) peptide lies in
the anchor residue, primarily affecting HLA binding
and not T cell recognition. Although the data
suggest that T cells in this patient can recognize 
both mutant and wild-type peptides when sta- 
bilized within a MHC-multimer system, the very 
low predicted affinity of the wild-type peptide 
to HLA would be expected to prevent adequate 
presentation in vivo. 

We next used MHC multimers that identify
neoantigen-reactive T (NAR-T) cells to characterize
NAR-T cells in unexpanded samples (Fig. 3, A
to D). MTFR2(D326Y)-reactive CD8+ T cells, identified
in unexpanded L011, were analyzed by means
of multicolor flow cytometry. We assessed relative
expression of co-inhibitory immune checkpoint
molecules and effector cytokines between
tumor-infiltrating CD4+FoxP3+ (regulatory T cell),
CD4+FoxP3− (CD4+ helper T cell), CD8+ multimer-negative,
and CD8+ multimer-reactive (MTFR2+)
T cell subsets. MTFR2(D326Y)+ CD8+ T cells expressed
high levels of co-inhibitory receptors
PD-1 and LAG-3 (Fig. 3C), which is consistent
with our bioinformatics findings (fig. S4). Almost
all NAR-T cells (97%) expressed high levels of
PD-1, compared with 49% of multimer-negative
tumor-infiltrating CD8+ T cells. CTLA-4 expression
was largely confined to CD4+FoxP3+ regulatory
T cells, which is consistent with preclinical
findings (16). PD-1+ MTFR2(D326Y)-reactive CD8+
T cells coexpressed high levels of granzyme B
(GzmB) (74.8%) (Fig. 3D). Characterization of
CHTF18(L769V)- and MYADM(R30W)-reactive CD8+
T cells mirrored findings in L011, with high expression
of PD-1 observed in 97% and 99.6% of
CHTF18(L769V)- and MYADM(R30W)-reactive CD8+
T cells, respectively (fig. S5, D and E).

Given the potential ability of clonal neoanti- 
gens to promote priming and infiltration by neo- 
antigen reactive T cells expressing high levels 
of PD-1, we explored whether response to PD- 
1 blockade in patients with advanced NSCLC 
may be influenced by neoantigen ITH. Exome 
sequencing data from a recent study in which 
34 patients were treated with pembrolizumab—
an antibody targeting PD-1—was obtained (table 
S5) (2), and the clonal architecture of each tumor 
estimated (possible for 31 of 34 tumors). 
Neoantigen burden was related to clinical re- 
sponse to pembrolizumab, with a high neoantigen 
repertoire associated with improved outcome, as 
previously reported (Fig. 4A). However, consistent
with the importance of clonal neoantigens,
the clinical efficacy of PD-1 blockade also ap- 
peared related to the clonal architecture of each 
tumor (Fig. 4A), with tumors derived from pa- 
tients with no durable benefit [defined as in (2)] 
exhibiting significantly higher neoantigen ITH 
than that of tumors from patients with a durable 
clinical benefit (P = 0.006, Wilcoxon rank sum
test). Almost every tumor (12 of 13) that exhibited
a low neoantigen subclonal fraction (<5% subclonal)
and high mutation burden (≥70, median
clonal neoantigens of the cohort) demonstrated 
durable clinical benefit with anti-PD-1 therapy. 
Conversely, only 2 out of 18 tumors with a high 
subclonal neoantigen fraction (>5%) or low clonal 
neoantigen burden benefited from pembrolizumab 
Fig. 1. Heterogeneity and prognostic value of neoantigen landscape in
primary NSCLC. (A) Total putative neoantigen burden in multiregion sequenced
NSCLC tumors. Proportion of clonal neoantigens, identified ubiquitously
in every tumor region, are shown in blue; shared subclonal neoantigens,
identified as shared in multiple tumor regions but not all, are shown in yellow; and
private subclonal neoantigens, identified in only one tumor region, are in red.
(B) Total putative neoantigen burden in TCGA LUAD tumors. Proportion of neoantigens
arising from clonal (blue) or subclonal (red) mutations is shown.
(C) Schematic illustrating use of different neoantigen ITH thresholds, with bar
plot showing separation into the two groups. Without an ITH threshold, samples
are simply grouped according to upper quartile of total neoantigen burden. For
each ITH threshold, the upper quartile of clonal neoantigens is used to separate
tumors with high and low clonal neoantigen burden, and the neoantigen ITH
threshold further groups samples. For example, an ITH threshold = 0 involves
grouping tumors with high clonal neoantigen burden and zero neoantigen heterogeneity
separately from those with low clonal neoantigen burden or any
neoantigen heterogeneity. (D) Overall survival curves for samples by using
different ITH thresholds. Shown are without an ITH threshold [log-rank, P =
0.025, HR = 0.47 (0.24-0.92)]; ITH threshold = 0 [log-rank, P = 0.019, HR =
0.21 (0.051-0.88)]; ITH threshold = 0.01 [log-rank, P = 0.0096, HR = 0.33
(0.14-0.79)]; and ITH threshold = 0.05 [log-rank, P = 0.021, HR = 0.45 (0.22-0.90)].
The number of patients in each group is listed below the survival curves.
Survival-curve legend: ≥ upper quartile clonal neoantigens and low neoantigen ITH
(≤ ITH threshold) versus < upper quartile clonal neoantigens or high neoantigen ITH
(> ITH threshold).
(Y2087 and SB10944). For example, despite a large
neoantigen burden, ZA6505 exhibited progressive 
disease, relapsing after 2 months. ZA6505 was 
one of the most heterogeneous tumors within 
the cohort, with over 80% of mutations classified 
as subclonal. 

Tumors with both a high clonal neoantigen 
burden and low neoantigen ITH were associated 
with significantly longer progression-free survi- 
val, and this relationship remained robust to the 
choice of ITH threshold, with lower hazard ratios 
observed as compared with the use of neo- 
antigen burden alone (Fig. 4B). The majority of 
clonal neoantigens could be attributed to smoking-induced
mutations (Fig. 4A). Greater PD-L1 expression
was observed in tumors harboring a
large clonal neoantigen burden and low neoantigen
heterogeneity compared with the remaining
tumors (P = 0.0017, χ² test) (Fig. 4A and fig. S6).

Next, we obtained data from 64 melanoma 
patients treated with either ipilimumab or tre- 
melimumab, which are antibodies against CTLA- 
4 (4). Clonal architecture analysis was possible 
for 57 of 64 tumors, and significantly improved 
overall survival was observed in tumors exhib- 
Fig. 2. Prediction and identification of neoantigen-reactive T cells in NSCLC
samples. (A) Putative neoantigens predicted for all missense mutations in L011.
The MTFR2(D326Y) neoantigen (FAFQEYDSF) is highlighted. (B) Putative neoantigens
predicted for all missense mutations in L012. The CHTF18(L769V) neoantigen
(LLLDIVAPK) and MYADM(R30W) neoantigen (SPMIVGSPW) are indicated. (C) Evolutionary
trees for L011 and L012 based on predicted neoantigens. (D and E) MHC-
iting a low neoantigen ITH and a high clonal
neoantigen burden. This relationship was ob- 
served when multiple different ITH thresholds 
were used, similar to the NSCLC cohort (ITH 
threshold = 0.01, P = 0.008; ITH threshold = 
0.02, P = 0.011; ITH threshold = 0.05, P = 0.083) 
(Fig. 4C). The relationship between neoantigen 
burden and survival outcome was not statis- 
tically significant without an ITH threshold (P = 
0.083) (Fig. 4C). 

To address whether radiation or cytotoxic ex- 
posure might stimulate production of subclonal 
neoantigens that could contribute to total neo- 
antigen burden but not the efficacy of checkpoint 
blockade, sequencing data from a more heavily 
pretreated melanoma cohort, comprising 110 tumors,
were obtained (17). For the subset of tumors
for which clonal analysis was possible (78 of 110 
tumors, a smaller and less adequately powered 
cohort as compared with the published analysis), 
total neoantigen burden was not significantly 
associated with efficacy of immune checkpoint 
inhibition [classified as in (17)], although a trend 
was observed (P = 0.24, Wilcoxon rank sum test) 
(fig. S7A). However, an enrichment for tumors 
exhibiting high neoantigen heterogeneity or low
clonal neoantigen burden (both stratified accord- 
ing to the median of the cohort) reached bor- 
derline significance in patients with minimal or 
no benefit compared with patients exhibiting a 
clinical benefit (P = 0.06, Fisher’s exact test). 
Neoantigen burden was not found to be signif- 
icantly associated with overall survival in this 
cohort (fig. S7B). Two of the most heterogeneous 
tumors (Pat58 and Pat151) with minimal or no
benefit were among those treated with the alkylating
agent dacarbazine (DTIC) before anti-CTLA-4
therapy, and for both, >98% of subclonal
mutations were attributable to mutational Signature 11,
a signature associated with prior
exposure to alkylating agents (18, 19). One patient
with stable disease—Pat80, who was also
treated with DTIC before anti-CTLA-4 therapy— 
also harbored an increase in Signature 11 and 
progressed by 6 months [classified as no durable 
benefit according to (2)]. These data suggest that 
therapy may induce subclonal mutations that 
fail to drive an efficient antitumor response, al- 
though further data are needed to confirm this 
observation. 
multimer screening of expanded, region-specific, tumor-infiltrating CD8+ T lymphocytes
and healthy donor (HD) CD8+ PBMC controls with candidate neoantigens
(L011, n = 288; L012, n = 354) and control HLA-matched viral peptides
(L011, n = 10; L012, n = 9). Frequency of CD8+ MHC-multimer-positive cells
out of total CD3+CD8+ tumor-infiltrating lymphocytes (TILs) is displayed for (D)
and (E), with size of symbol increasing with frequency.

Last, we reasoned that T cells recognizing clonal
antigens should be detectable in patients
deriving favorable responses to checkpoint blockade.
Previous analysis of peripheral blood lymphocytes
(PBLs) from CA9903, a LUAD patient
with an exceptional response to pembrolizumab,
identified a CD8+ T cell population in autologous
PBLs, recognizing a predicted neoantigen resulting
from a HERC1(P3278S) mutation (ASNASSAAK)
(2). Consistent with the relevance of clonal neoantigens,
this mutation was found to be present
in 100% of cancer cells within the sequenced tumor
(Fig. 4D). Similarly, analysis of peripheral
blood mononuclear cells (PBMCs) from the patients
with CR9306 and CR0095 (melanomas that
responded to anti-CTLA-4 therapy, resulting in
prolonged patient survival) identified CD8+ T cell
populations recognizing tumor-specific neoantigens
(4). In both cases, the neoantigens linked to
a T cell response were derived from clonal mutations,
predicted to be present in 100% of cancer
cells (Fig. 4, E and F).

Previous studies have reported that neoanti- 
gen burden influences sensitivity to immune 
checkpoint blockade in NSCLC and melanoma 
(2, 4, 17). However, the influence of ITH on this 
relationship has not been investigated. Our re- 
sults, although limited by access to small and di- 
verse patient cohorts and single-site biopsy data 
that likely overestimate the number of clonal muta- 
tions, suggest that clonal and subclonal neoantigens 
do not drive equally effective antitumor immu- 
nity. Indeed, using the described approach, de- 
spite screening more than 250 peptides against 
putative subclonal neoantigens, we were only able 
to detect T cells that recognize clonal neoantigens. 
Conceivably, higher-neoantigen ITH may result 
in lower antigen dosage as compared with homo- 
geneous tumors with high clonal neoantigen bur- 
den, thus reducing the chances of identifying 
T cells reactive to subclonal neoantigens. Fur- 
thermore, in cases in which T cells reactive to 
subclonal neoantigens are generated, these will 
be unable to target all tumor cells, limiting over- 
all tumor control. 

The observation that certain anti-CTLA-4 
refractory tumors were enriched for subclonal 
mutations caused by alkylating agents suggests 
that mutations induced by therapy may en- 
hance total neoantigen burden but might not 
elicit an effective antitumor response, possibly 
because of the subclonal nature of the neoanti- 
gens that results from cytotoxic exposure. These 
results highlight the need to consider both the 
antitumor effects of alkylating agents as well 
as the potential risk of inducing subclonal mu- 
tations (19). 

The identification of cytotoxic tumor-infiltrating 
T cells that recognize clonal mutations, shared by 
all tumor cells, might hold promise for adoptive 
therapy strategies to address the challenges of 
ITH (20). The extensive clonal mutational reper- 
toire present in smoking-associated NSCLC (8, 21) 
could render this disease vulnerable to vaccina- 
tion or T cell therapies targeting multiple clonal 
neoantigens, in combination with appropriate 
immune checkpoint modulation. 
Fig. 3. Identification and characterization of tumor-infiltrating neoantigen-reactive CD8+ T cells
in early-stage NSCLC. (A) MHC-multimer analysis of nonexpanded, tumor-infiltrating CD8+ T lymphocytes
isolated from tumor regions 1 to 3 and normal lung tissue of patient L011 identifies CD8+ TILs
reactive to mutant MTFR2 peptide. (B) MHC-multimer analysis of nonexpanded, tumor-infiltrating CD8+
T lymphocytes isolated from tumor regions 1 to 3 and normal lung tissue of patient L012 identifies two
distinct populations of CD8+ TILs reactive to mutant CHTF18 and MYADM peptide. The frequency of CD8+
MHC-multimer-positive cells out of total CD3+CD8+ TILs is displayed for (A) and (B). (C) Multiparametric
flow cytometric analysis of tumor-infiltrating T lymphocyte subsets isolated from L011 region 3. Phenotypic
data are representative of all tumor regions. Relative expression of iCTLA-4 (intracellular CTLA-4),
surface PD-1, and surface LAG-3 by CD4+FoxP3+ (regulatory T cell), CD4+FoxP3− (CD4 helper T cell), CD8+
multimer-negative, and CD8+ multimer-reactive (CD8+ MTFR2+) T cells is displayed, plotted against iKi67
(intracellular Ki67). (D) Coexpression of PD-1 and iGzmB (intracellular granzyme B) by tumor-infiltrating
T lymphocyte subsets isolated from L011 region 3.
Fig. 4. Neoantigen clonal architecture and clinical benefit of immune checkpoint
blockade. (A) Samples are grouped according to clinical benefit, with durable clinical
benefit on left and no durable benefit on right [defined as in (2)]. Bar plot depicts
clonal neoantigens in blue and subclonal neoantigens in red. Mutational signatures
identified within each sample, subtype, and expression of PD-L1 are shown below.
(B) Progression-free survival in NSCLC (2) cohort treated with antibody to PD-1 either
without an ITH threshold [HR = 0.29 (0.12-0.69), log-rank P = 0.0032] or with an ITH
threshold of 0.01 [HR = 0.20 (0.07-0.60), log-rank P = 0.0017], 0.02 [HR = 0.25
(0.09-0.67), log-rank P = 0.0034], or 0.05 [HR = 0.17 (0.07-0.44), log-rank
P = 0.000061]. (C) Overall survival in melanoma (4) cohort treated with antibody to
CTLA-4 either without an ITH threshold [HR = 0.51 (0.23-1.11), P = 0.083] or with an
ITH threshold of 0.01 [HR = 0.29 (0.11-0.77), log-rank P = 0.008], 0.02 [HR = 0.34
(0.14-0.81), log-rank P = 0.011], or 0.05 [HR = 0.51 (0.23-1.11), P = 0.083].
An ITH threshold of 0.05 results in the same survival curve as no ITH threshold because
no tumors with a high neoantigen burden exhibit >0.05 neoantigen ITH. (D to F) Clonal
architecture of (D) CA9903, (E) CR9306, and (F) CR0095, with mutations
yielding neoantigens that elicit a T cell response highlighted. Blue dots represent clonal
mutations, with subclonal mutations depicted as red dots. Density plots are
shown above.

TRANSPORTER FUNCTION


Direct observation of proton pumping 
by a eukaryotic P-type ATPase 


Salome Veshaguri, Sune M. Christensen, Gerdi C. Kemmer, Garima Ghale,
Mads P. Møller, Christina Lohr, Andreas L. Christensen, Bo H. Justesen,
Ida L. Jørgensen, Jürgen Schiller, Nikos S. Hatzakis, Michael Grabe,
Thomas Günther Pomorski, Dimitrios Stamou


In eukaryotes, P-type adenosine triphosphatases (ATPases) generate the plasma 
membrane potential and drive secondary transport systems; however, despite their 
importance, their regulation remains poorly understood. We monitored at the single- 
molecule level the activity of the prototypic proton-pumping P-type ATPase Arabidopsis 
thaliana isoform 2 (AHA2). Our measurements, combined with a physical nonequilibrium 
model of vesicle acidification, revealed that pumping is stochastically interrupted by long- 
lived (~100 seconds) inactive or leaky states. Allosteric regulation by pH gradients 
modulated the switch between these states but not the pumping or leakage rates. The 
autoinhibitory regulatory domain of AHA2 reduced the intrinsic pumping rates but 
increased the dwell time in the active pumping state. We anticipate that similar functional 
dynamics underlie the operation and regulation of many other active transporters. 


Electrochemical gradients across cellular membranes
control many essential biological processes.
These gradients are generated by
primary active transporters and are used to 
drive the exchange of other solutes through 
secondary active transporters and to facilitate 
signaling through ion channels (1). Patch clamp
recording has made it possible to observe the 
functional dynamics of single ion channels, re- 
vealing discrete on and off states, subconduc- 
tance states, and other mechanistically important 
features that macroscopic experiments cannot 
probe (2). However, despite extensive structural 
and biochemical efforts (3), we currently lack a
similar depth of understanding of transporters, 
because they in general do not produce electrical- 
ly detectable single-molecule transport signals 
(4-8). We monitored at the single-molecule level 
the functional dynamics of a eukaryotic primary 
active transporter, Arabidopsis thaliana H+-adenosine
triphosphatase (ATPase) isoform 2
(AHA2, referred to as the proton pump), which is
responsible for energizing the plasma membrane 
of plants and fungi (figs. S1 and S2) (3, 9). This 
provided insights into how the activity of P-type 
ATPases is modulated by autoregulatory terminal 
domains (R domains) and pH gradients (10, 11).

We used total internal reflection fluorescence 
(TIRF) microscopy to image with high through- 
put single nanoscopic lipid vesicles tethered to a 
solid support (Fig. 1, A and B, and figs. S3 and 
S4). Tethering was accomplished with a biotin/neutravidin
protocol (12), which maintains the native
function and diffusivity of reconstituted transmembrane
proteins (13) and the vesicles’ spherical
morphology (14) and low passive ion permeability
(15). The fluorescence intensity of all single vesicles 
was quantitatively converted to pH (fig. S5) and 
tracked over periods of up to 30 min. 

Initial studies were carried out on the well- 
studied activated form of AHA2, which lacks the 
flexible C-terminal autoinhibitory R domain (AHA2R)
(Fig. 1A and figs. S1 to S3) (9). Initialization of H+
pumping into the vesicle lumen was triggered by
the addition of ATP and Mg2+, which are non-membrane-permeable
and thus only activate proton
pumps with an outward-facing ATP-binding
domain (Fig. 1A) (12). Consistent with this, we
never observed lumenal alkalinization (Fig. 1C).
Acidification kinetics reached a plateau of well-defined
pH (ΔpHmax) as a result of a dynamic
steady state, in which active pumping (influx)
of protons matched the passive leakage (efflux)
of protons through the membrane due to the
buildup of a proton motive force (16). As expected,
addition of the protonophore CCCP collapsed the
H+ gradients (Fig. 1C), whereas controls performed
without Mg2+, ATP, or AHA2R showed no response
(fig. S6D). Furthermore, the activity of the pump
was blocked by the addition of the specific inhibitor
vanadate (11), and it decayed after ATP
and Mg2+ were flushed out (fig. S7). To control
for potential artifacts arising from the surface
tethering of vesicles, we performed a side-by-side
comparison with vesicles suspended in solution,
which proved indistinguishable within experimental
uncertainties (Fig. 1C and fig. S6). Taken
together, these results demonstrate that we were
able to observe the AHA2R-mediated and ATP-fueled
pumping of protons against their concentration
gradient into the lumen of single vesicles.
The single-vesicle experiments revealed a heterogeneity
of acidification rates and ΔpHmax values
between vesicles (Fig. 1C) that remain masked in
the ensemble averages (16).
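
The dynamic steady state described above can be illustrated with a deliberately simple toy model, sketched below in Python. It assumes a constant pumping term and a leak that grows linearly with the pH gradient, so that the plateau is the ratio of the two rates. This linear form, the function name, and the rate values are assumptions for illustration only; they are not the physical nonequilibrium model used in the study.

```python
def simulate_delta_ph(pump_rate, leak_rate, dt=0.1, t_end=600.0):
    """Euler integration of d(dpH)/dt = pump_rate - leak_rate * dpH.

    pump_rate: assumed constant acidification rate (pH units per second).
    leak_rate: assumed first-order leak constant (per second).
    Returns the list of dpH values, starting from zero gradient.
    """
    steps = int(round(t_end / dt))
    delta_ph = 0.0
    trace = [delta_ph]
    for _ in range(steps):
        # Influx from pumping minus efflux that scales with the gradient.
        delta_ph += (pump_rate - leak_rate * delta_ph) * dt
        trace.append(delta_ph)
    return trace


# Hypothetical rates chosen only to make the plateau easy to read.
trace = simulate_delta_ph(pump_rate=0.02, leak_rate=0.02)
plateau = 0.02 / 0.02  # steady state where influx equals efflux
print(f"final dpH = {trace[-1]:.4f}, expected plateau = {plateau:.4f}")
```

In this sketch the gradient rises monotonically and levels off once pumping and leakage balance, which is the qualitative behavior of the acidification traces; setting `pump_rate` to zero partway through would make the gradient decay, as when ATP and Mg2+ are flushed out.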

At the low protein-to-lipid molar ratio (1:12,000) 
used in our experiments, 84% of vesicles exhibited 
no detectable pH changes (Fig. 1C and Fig. 2A, 
top trace) indicating the absence of active pumps 
and thus suggesting that there are only a few 
active pumps in each of the remaining vesicles 
whose pH changed over time (hereafter termed 
active vesicles). We inspected the pH changes in the 
16% of active vesicles and indeed found that all of 
them exhibited the hallmark of single-molecule 
behavior; i.e., stochastic changes between discrete 
states (Fig. 2A). Because the passive leakage rates 
of the vesicles are constant over time (fig. S10), 
these data demonstrate that the individual proton 
pumps are stochastically transitioning between 
active and inactive states. This behavior is termed 
functional dynamics (17-24) and is key to the func- 
tion and regulation of ion channels (25). 
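
A minimal Python sketch of such two-state switching is given below. It assumes exponentially distributed dwell times in the active and inactive states, which the text does not state; the function names and the mean dwell times are hypothetical and chosen only to illustrate how the fraction of time spent pumping follows from the dwell times.

```python
import random


def simulate_pump_states(mean_on, mean_off, t_end, rng):
    """Simulate a pump that alternates between active and inactive states.

    Dwell times are drawn from exponential distributions (an assumption).
    Returns a list of (start, end, is_active) segments covering [0, t_end].
    """
    t = 0.0
    active = True
    segments = []
    while t < t_end:
        mean = mean_on if active else mean_off
        dwell = rng.expovariate(1.0 / mean)
        segments.append((t, min(t + dwell, t_end), active))
        t += dwell
        active = not active
    return segments


def active_fraction(segments):
    """Fraction of the total time spent in the active state."""
    total = sum(end - start for start, end, _ in segments)
    on = sum(end - start for start, end, is_active in segments if is_active)
    return on / total


rng = random.Random(0)  # fixed seed so the run is repeatable
segments = simulate_pump_states(mean_on=100.0, mean_off=100.0,
                                t_end=1_000_000.0, rng=rng)
print(f"{len(segments)} segments, active fraction = "
      f"{active_fraction(segments):.3f}")
```

With equal mean dwell times the active fraction settles near one half over a long trace; unequal dwell times shift it toward mean_on / (mean_on + mean_off).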

Further examination of all active vesicle traces 
revealed that ~60% of them reverted back to the 
zero ΔpH baseline after switching off, strongly
suggesting the presence of only one molecule, 
because it is improbable for many molecules to 
switch off simultaneously (Fig. 2A). In the re- 
maining traces (~40%), we observed two or three 
discrete plateaus, a feature that has been observed 
in all studies of single channels to date and has 
been interpreted to demonstrate that the activity 
of multiple single molecules can be discretely re- 
solved. The latter conclusion was further supported 
by experiments in which titration of the protein- 
to-lipid ratio modulated the percentage of multiple 
plateaus (fig. S5G), excluding the possibility that 
multiple plateaus represent multiple single-molecule
activity states. These observations allowed us to 
unambiguously identify the traces resulting from 
a single active proton pump, which we then se- 
lected for further analysis. The activity of single 
proton pumps was amplified and reported by 
~10° pH-sensitive fluorophores (figs. S3 and S4) 
(16), circumventing the issue of photobleaching 
that fundamentally restricts most fluorescence 
studies of single molecules. 

Dynamic transitions between active and inac- 
tive states were also observed in experiments with 
wild-type AHA2 (Fig. 2C), demonstrating that 
they are not solely a property of the truncated 
version. Here ~80% of all vesicles were inactive, 
whereas ~73% of those that showed activity had 
a single plateau indicating a single molecule (Fig. 
2D). With wild-type AHA2, we succeeded in using 
a SNAP-tag to fluorescently label the protein 
and count directly the number of proteins per 
vesicle (Fig. 2E). This allowed us to observe 
activity dynamics and directly count the number 
of labeled proteins at the same time on the same 
vesicles (Fig. 2, C and E). We then estimated the 
labeling efficiency and the probability that a proton 
pump was active (12). We were thus able to 
quantitatively convert the bleach-step distribution 
to a distribution of active molecules per vesicle 
and demonstrated that 70 ± 15% of active proteoliposomes 
carried one active molecule (Fig. 2F). 
This was in quantitative agreement with the dis- 
tribution of activity plateaus (~73%) (Fig. 2D), pro- 
viding an additional demonstration that we can 
resolve and record the functional dynamics of the 
proton pump at the single-molecule level. 

The activity of the proton pump, and probably 
other active transporters, is thus not constant in 
time (Fig. 2). Therefore, for transporters (just like 
ion channels), the rates measured in macroscopic 
experiments are the product of the active-state 
probability and the intrinsic pumping rate. To 
quantitatively analyze the kinetics and dynamics 
of pumping, we constructed a physical model of 
a single vesicle (12), which accounts for several 
parameters that affect the acidification kinetics, 
including passive and active ionic fluxes across 
the membrane, proton buffering in the lumen, 
vesicle size, and buildup of membrane potential 
(Fig. 3A) (26). Proton pumping is modeled with a 
fixed rate (J_p), a lifetime (t_on), and a time between 
pumping events (t_off) (Fig. 3, A and B). The vesicle 
is assumed to have a passive membrane permeability 
to protons (P_leak), which is constant over 
time, as revealed by control experiments (fig. S10). 
The stochastic switching of the pump between 
active and inactive states was extracted directly 
from the traces and used as time-dependent input 
to the model. The model is constrained to fit 
the entire trace, and it provides a realistic description 
of the full electrochemical gradient and 
a direct estimation of the absolute numbers of 
pumped and leaked ions. 

Fig. 1. Imaging proton pumping into the lumen of single surface-tethered vesicles using TIRF 
microscopy. (A) Illustration of AHA2-reconstituted vesicles tethered to a passivated glass surface and 
imaged on an individual basis with TIRF microscopy. Zoom: Extravesicular addition of both ATP and Mg²⁺ 
activated exclusively outward-facing AHA2 molecules, triggering H⁺ pumping in the vesicle lumen. We 
quantified changes in the vesicular H⁺ concentration by calibrating the response of the lipid-conjugated 
pH-sensitive fluorophore pHrodo. Valinomycin was always present to mediate K⁺/H⁺ exchange and 
prevent the buildup of a transmembrane electrical potential. (B) TIRF image of single vesicles tethered on a 
passivated glass slide. (C) Acidification kinetics of single vesicles upon addition of ATP and Mg²⁺. Red 
traces highlight three representative signals from single vesicles, showcasing the absence of transport 
activity, the continuous pumping of protons, and fluctuations in proton-transport activity. The black trace is 
the average of 600 single-vesicle traces. As expected, addition of the protonophore CCCP collapsed the 
proton gradient established by AHA2R. 
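
To make the structure of such a model concrete, here is a deliberately reduced sketch in Python. It keeps only two ingredients from the description above, a fixed pumping rate that is switched on and off over time and a constant first-order passive leak, and ignores buffering, vesicle size, and membrane potential. The function name `simulate_vesicle`, the dimensionless state variable, and all numerical values are illustrative assumptions, not the model or parameters from (12).

```python
def simulate_vesicle(on_intervals, pump_rate, leak_rate, t_end, dt=0.1):
    """Euler-integrate dx/dt = pump_rate * on(t) - leak_rate * x.

    x is a dimensionless stand-in for the proton excess in the lumen.
    on_intervals is a list of (start, stop) times in seconds during
    which the pump is taken to be active (an assumed input format).
    """
    n_steps = int(round(t_end / dt))
    x = 0.0
    trace = [x]
    for i in range(n_steps):
        t = i * dt
        pump_on = any(start <= t < stop for start, stop in on_intervals)
        # Pumping adds protons only while active; the leak always removes them.
        dx = (pump_rate if pump_on else 0.0) - leak_rate * x
        x += dx * dt
        trace.append(x)
    return trace


# Hypothetical switching pattern: active for 0-300 s and 500-700 s.
trace = simulate_vesicle([(0, 300), (500, 700)],
                         pump_rate=1.0, leak_rate=0.01, t_end=1000)
```

In this toy version, a pump that stays on drives the variable toward pump_rate / leak_rate, and a pump that switches off lets it relax back toward zero, which is the qualitative shape of the single-pump traces in Fig. 2A.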

Initially, all experimental traces were fit with 
the model by varying two parameters: J_p and 
P_leak. This provided a good quantitative description 
of the majority of AHA2R traces (~65% of 
126 counts, fig. S8); however, it systematically 
underestimated the observed leaking rates for 
the remaining traces (Fig. 3B, blue line), suggest- 
ing the existence of an additional proton-leaking 
route apart from passive leakage through the 
membrane (fig. S8). Indeed, leakage of ions through 
transporters has been reported before; e.g., for P- 
and V-type ATPases (27, 28). To test this hypoth- 
esis, we collected all lifetimes of exponential fits 
to the leakage kinetics from traces transitioning 
between active and inactive states. The histo- 
gram of leakage lifetimes (fig. S9C) had two 
clearly separated peaks: one that according to 
control experiments corresponded to passive leak- 
age through the membrane (a transmembrane 
leak) and another that was approximately 20 
times faster (figs. S9 and S10). The latter peak 
was specifically inhibited by the addition of 
vanadate, which locked the pump in the E2 state 
(11) (fig. S9D), demonstrating that the leak is not 
passively mediated by the membrane (or the protein/ 
membrane interface) but by the pump itself. Be- 
cause vanadate is membrane-impermeable, we 
can exclude the possibility that the fast-leak com- 
ponent originated from pumps with the op- 
posite orientation, because they would not be 
blocked by vanadate. We thus modified the model 
to include a time-dependent transprotein proton 
leak (P_AHA2), which turns on once pumping 
stops and turns off at the beginning of the next 
pumping cycle (Fig. 3C, blue dotted line). As ex- 
pected, the revised model considerably improved 
the fits of the remaining traces (Fig. 3B, red line, 
and fig. S8). 
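
The leakage lifetimes discussed above come from exponential fits to the decay phases of individual traces. The fitting procedure itself is not described in this section, so the following Python sketch only shows one simple way to extract such a lifetime: a least-squares line through the logarithm of the signal. The function name and the synthetic, noise-free data are assumptions; the two lifetimes (200 s and 10 s) are hypothetical values chosen only so that they differ by a factor of 20.

```python
import math


def decay_lifetime(times, values):
    """Return tau from a least-squares fit of ln(value) = ln(a) - t / tau.

    All values must be positive, since the fit is done on their logarithm.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic decays standing in for a slow and a fast leak component.
times = [float(t) for t in range(0, 60, 2)]
slow = [math.exp(-t / 200.0) for t in times]
fast = [math.exp(-t / 10.0) for t in times]
tau_slow = decay_lifetime(times, slow)
tau_fast = decay_lifetime(times, fast)
print(tau_slow / tau_fast)
```

Collecting one such lifetime per decay phase and plotting them as a histogram is what would reveal two separated populations of the kind reported in fig. S9C. Real traces are noisy, so a weighted or nonlinear fit may be more appropriate there.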
Fig. 2. Single-molecule observation of proton pumping reveals active and 
inactive states. (A) Typical examples of pH changes inside individual AHA2R-reconstituted 
vesicles. ATP and Mg²⁺ (2 mM) were added to initiate proton 
pumping, and CCCP (5 µM) was added to collapse the pH gradients. Traces 
show −ΔpH, defined as the difference between the initial and final pH. Images of 
each respective liposome at different time points are shown below each trace. 
At the right-hand side of the traces, we plotted histograms of pH plateaus 
numbered to indicate the number of active pumps per vesicle. The pH inside 
the majority of vesicles showed no changes, indicating the absence of functional 
transporter molecules (top panel). For the majority of active vesicles, we 
observed intermittent H⁺ pumping, indicating the presence of single molecules 
(middle panels). The observation of two discrete steady-state pH plateaus in 
single-vesicle traces indicated the occasional presence of two active pumps 
per single vesicle (bottom panel). (B) Population histogram of pH plateaus for 
AHA2R-reconstituted vesicles (n = 3, where hereafter n is the number of 
independent experiments). (C and D) Same as in (A) and (B) but for full-length 
AHA2. For (D), n = 2. Labeling of AHA2 with Alexa Fluor 647 enabled counting 
on the same vesicles of both the number of labeled AHA2 proteins (E) and of 
the respective activity dynamics (C). (F) The histogram of active proteins per 
vesicle was calculated from step-bleaching analysis of the data in (E) that was 
corrected for labeling efficiency and the probability that a proton pump is active 
(12). The two independent methods for estimating the number of active molecules 
agreed that ~70% of vesicles containing a protein have one active 
proton pump. 
Fig. 3. Modeling active, inactive, and leaky states and their role in autoinhibiting 
proton pumps. (A) The main parameters of the physical model we 
used to fit changes in the vesicular pH were pumping rate (J_p), protein-associated 
leak (P_AHA2), membrane leak (P_leak), valinomycin-induced K⁺ permeability 
(P_K), buffering capacity in the interior of the vesicle (B), and electrical 
potential across the membrane (Ψ) (12). (B) Example of a typical proton 
pumping trace and respective fits without (blue) and with (red) a transprotein 
proton leak. A threshold in the first derivative of the pH kinetics (12) was used to 
define the lifetime of the active state t_on and the time between pumping events 
t_off. (C) Temporal evolution of the proton pumping rate (gray) and the proton 
efflux rates due to passive membrane leakage (red) and transprotein backflow 
(blue) for the pH trace shown in (B). (D) Histogram of proton permeability 
associated with the membrane, AHA2, and AHA2R. Respective counts were 95, 
37, and 45. (E) Histogram of pumping rates for AHA2R and AHA2. Respective 
counts were 126 and 95. (F) Histogram of t_on for AHA2R and AHA2. Respective 
counts were 241 and 134. The bar at >1200 s shows the number of traces that 
did not switch in the duration of the experiment. (G) Histogram of t_off for 
AHA2R and AHA2. Respective counts were 69 and 39. For AHA2R and AHA2, 
respectively, the number of independent experiments was 3 and 2 and the 
number of individual proteoliposomes analyzed was 126 and 95. 
respectively, where uncertainties represent 95% confidence intervals from the fits. (C) Probability of observing a transprotein leak as a function of pH gradient. For 
AHA2R and AHA2, data were binned with 0.25 and 0.5 pH units; the number of independent experiments and individual proteoliposomes analyzed was the same 
as in Fig. 3. Spearman's rank order correlation coefficients ρ(126) = 0.40, P =10~“, and ρ(95) = 0.30, P = 0.03 indicated a strong positive correlation between 
leakage probability and ΔpH_max for both AHA2R and AHA2. 
Next, we quantitated proton permeabilities by 
fitting the kinetics with the model. The average 
transmembrane leak P_leak (Fig. 3D and fig. S11) 
was ~7 x 10°° cm/s, which is in line with previous 
measurements and estimates (26), and the average 
transprotein leak P_AHA2 had a similar value 
(~46 x 10° cm/s) (Fig. 3D). However, when 
normalized for surface area, the transprotein pro- 
ton current was greater than the transmembrane 
by a factor of ~10*. 

The inhibitory R domain of AHA2 has been 
shown to reduce the net macroscopic proton trans- 
port rate by ~twofold (10, 11). In order to elu- 
cidate the mechanisms underlying this regulation, 
we characterized the activity of the proton pump 
with and without the R domain. Counterintuitively, 
the autoinhibitory R domain increased the total 
time the transporter spent in the active state, both 
by increasing t_on ~threefold (from 337 to 951 s, P = 
10”) and by decreasing t_off ~0.5-fold (from 121 
to 65 s, P = 0.05) (decay constants of exponential 
fits to the distributions in Fig. 3, F and G; unless 
otherwise stated, P is from a Kolmogorov-Smirnov test 
of statistical similarity between two distributions). 
Thus, the probability of finding the pump in an 
active state P_on = t_on/(t_on + t_off) increased ~200% 
for AHA2 (from 0.35 ± 0.05 to 0.76 ± 0.06). 
Importantly, 100% of AHA2R and ~60% of AHA2 
molecules switched on/off during our observation 
period, highlighting the fact that functional dyna- 
mics is a dominant property of this system (Fig. 
3F). The R domain also had a pronounced effect 
on the overall intrinsic transport rates of the 
pump, which were reduced by ~10-fold as compared 
to AHA2R (from 928 to 85 protons/s, average 
values, P = 10°°°) (Fig. 3E). In addition, the 
R domain promoted an overall decrease in the 
transprotein leak (~1.4-fold, P = 0.005) (Fig. 3D). 
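
The statement that a macroscopic rate is the product of the active-state probability and the intrinsic pumping rate can be checked against the values just quoted. The short Python sketch below multiplies the reported P_on by the reported average intrinsic rate for each construct. The helper name is made up for illustration, and the product deliberately ignores the transprotein leak and any pH-gradient dependence, so it is a rough consistency check and not a prediction of ensemble measurements.

```python
def time_averaged_rate(p_on, intrinsic_rate):
    """Active-state probability times intrinsic pumping rate (protons/s)."""
    return p_on * intrinsic_rate


# Values quoted in the text: P_on and average intrinsic rate per construct.
rate_truncated = time_averaged_rate(0.35, 928.0)  # AHA2 without the R domain
rate_wild_type = time_averaged_rate(0.76, 85.0)   # full-length AHA2

print(f"truncated: {rate_truncated:.0f} protons/s")
print(f"wild type: {rate_wild_type:.0f} protons/s")
```

With these inputs the products come out near 325 and 65 protons/s, so in this simplified picture the longer time spent in the active state only partly offsets the ~10-fold lower intrinsic rate.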

The activity of the pump was also regulated by 
the pH gradients established across the membrane 
during proton pumping. Increasing ΔpH_max decreased 
the lifetime of the active state by >twofold, 
but only for the wild type (Fig. 4, A and B). This 
regulation seems to be transmitted allosterically 
across the bilayer, because the R domain of AHA2 
is facing the vesicle exterior, where the pH remains 
constant. In addition, traces with larger ΔpH_max 
had a dramatic eightfold increase (from 0.1 to 0.8) 
in the probability of a transprotein leak for both 
forms of AHA2 (Fig. 4C). Thus, regulation by pH 
gradients can manifest through two mechanisti- 
cally distinct processes that reduce the net aver- 
age proton transport: reduction of the pumping 
lifetime and increase of the probability of a trans- 
protein leak, whereby only the former is encoded 
in the R domain. 

Our observations of proton transport and leak- 
age dynamics at the single-molecule level also 
provide critical insights into the ATP/H⁺ stoichiometry 
(27, 28). Ensemble average experiments 
have reported that the buildup of pH gradients 
can in general alter the stoichiometry of transport 
and therefore pumping rates (27, 28). Contrary to 
expectation, we found that the intrinsic (single- 
molecule) pumping rate remained constant for 
gradients as large as 2 pH units (Fig. 3, B and C, 
and fig. S12C). As discussed above, pH gradients 
did reduce the net proton transport, but primar- 
ily by increasing the probability of a downhill 
transprotein leak (Fig. 4C). However, because the 
transprotein leak takes place once the pump has 
switched to the inactive state (Fig. 3C and fig. S9), it does 
not affect the actual stoichiometry of active trans- 
port. In contrast, the R domain reduced the in- 
trinsic pumping rates by ~10-fold (Fig. 4E). Because 
the R domain does not significantly affect the en- 
semble average ATPase activity (10, 29), our mea- 
surements suggest that the R domain can reduce 
the stoichiometry of active transport by a factor of 
~10 (or 20 if we correct for the change in P_on) (11). 
Finally, we note that our measurements of proton 
transport were integrated over thousands of Post- 
Albers catalytic cycles per second per single mol- 
ecule. A better mechanistic understanding of these 
processes would ultimately require direct measure- 
ment of the stoichiometry at the level of single 
turnover cycles or careful molecular simulations. 

We have developed a technique to observe, in a 
highly parallel manner, uphill substrate transport 
mediated by single transporter molecules into 
single nanoscopic lipid vesicles. Our measure- 
ments revealed the existence and the dynamics 
of several distinct functional states (active, inactive, 
and leaky) that together defined the activity and 
regulation of the proton pump, and that, we anti- 
cipate, underlie the operation of many other 
primary and secondary active transporters. The 
assays introduced here render these processes 
accessible to direct experimental observation. 
The Effect of Priming Intellectual Virtues 
on Individual Effort and Understanding 


TECHNOLOGY, ENGINEERING, AND MATH 
Winner: Sangwon Hyun, Carnegie Mellon University 
Epidemiological Forecasting with Statistical Models 

Honorable Mention: Aniurka Duverge Carreno, Howard University 
Exploring Different Configurations of Wire Ropes Supporting Slender 
Equipment to Minimize 
Internationalizing Japan’s academia: Riding the wave of change 


Japan's top universities are implementing programs to meet 
the goals of recent multimillion dollar government projects 
to internationalize their campuses and compete in the global 
competition for top class staff and students. By Adarsh Sandhu 


Reforming Japan’s universities 

Surprisingly, in spite of the fact that Japan has produced 
more Nobel laureates than any other country in Asia, there 
were only two Japanese universities in the top 100 of the 2014 
Times Higher Education World University Rankings. By con- 
trast, there were three each from China and South Korea, and 
two from Singapore. These facts are of concern to educators in 
Japan, recently prompting the Ministry of Education, Culture, 
Sports, Science and Technology (MEXT) to launch two 10-year, 
multimillion dollar projects: the Program for Promoting the 
Enhancement of Research Universities (hereafter “RU”) in 2013 
and the Top Global University (TGU) project in 2014. The pur- 
pose of these projects is to support Japanese universities with 
research and education reforms, enabling them to internationalize 
and better compete globally (1, 2). The RU program aims 
to improve research infrastructure, while the TGU program 
targets internationalization of education. 

Each university received initial annual funding of US$2 mil- 
lion to US$4 million under the RU program and US$1.4 million 
to US$3.5 million under the TGU program. Both have a midterm 
assessment (after five years), with the possibility of cuts or 
even termination of funding for those institutes that have not 
achieved their initial targets. Details of the original applications 
are posted on the MEXT website, to enhance transparency and 
accountability. 

These programs have generated a lot of interest in both 
Japan and overseas, particularly because the universities and 
institutes selected were chosen by an unprecedented, top-down 
approach based on performance metrics—such as quality of 
publications and patent submissions—and staff-to-student ratios. 

The programs were devised to resolve two major prob- 
lems facing Japanese academia. The first is the dramatic fall 
in Japan’s birth rate, with government figures indicating that 
within 20 years, Japan's 780 universities will have more places 
available than domestic students to fill them. Currently, about 
600,000 students apply for approximately 580,000 places at 
Japan's universities. The projected drop in enrollment will lead 
to mergers or the closure of less competitive institutes. 

The second problem is the need to improve Japan's stand- 
ing in world ranking tables. University rankings are by no 
means the most effective means of assessing the quality of 
research and education in academia, but they do highlight the 
importance of internationally oriented curricula and strategic 
outreach programs to enhance visibility. Improving global 
appeal may also attract international students to counteract 
future enrollment shortfalls. 


Attracting international students and staff 

It is estimated that there are currently around 4 million international 
students in the world, with the United Nations Educational, 
Scientific and Cultural Organization (UNESCO) predicting 
that this number may increase to 7 million by 2020. Japan 
currently has about 184,000 overseas students (approximately 
3% of all students in tertiary education) with plans to increase 
this number to 300,000 over the next decade. By contrast, 
from 2013 to 2014, universities in the United States enrolled 
approximately 886,000 international students. Courses taught 
in English, the availability of scholarships, and genuine career 
opportunities upon graduation are just some of the factors that 
attract the world’s best and brightest to these universities. 

Why do Japanese universities struggle to attract interna- 
tional students? The main reason is that the Japanese language 
is a formidable hurdle to overcome. When Joby Joseph, a 
physicist at the Indian Institute of Technology (IIT) Delhi, first 
visited a university in Japan, he was surprised to see that all 
the courses were taught using Japanese language textbooks. 
“There was not a single word of English in the books,” he said. 

To overcome language problems, universities selected for 
the TGU program have changed their curricula to include more 
English language courses and have hired bilingual support and 
administration staff to assist international visitors. Importantly, 
they have also started to send their own students abroad for 
short-term stays to hone their English language skills and expe- 
rience different ways of conducting research and education. 


Career opportunities at Japan’s top universities 

Universities in Japan are now more open and dynamic in 
their ideas than they have been in the last 50 years, offering 
real opportunities for tenured positions for staff from overseas. 
The drive by Japan's top universities to recruit international 
staff and students is an unprecedented opportunity for career 
development for young students and faculty members from 
outside Japan wishing to collaborate with Japanese researchers. 

All scientists, from graduate students to established 
academics, are encouraged to seek research opportunities 
in Japan. Prospective graduate students can research labo- 
ratories at Japan’s top universities through their websites to 
find academics doing research in their area of interest. They 
can then contact the faculty directly via e-mail to ask about 
research openings and the possibility of financial support. It 
is worth noting that graduate school programs start in either 
April or October, but the precise dates for formal interviews 
and examinations depend on the institute. For academics, 
direct networking at conferences and similar events is critical 
for building relationships that may lead to finding openings at 
Japanese universities. 

For international scientists, the location of an institute can 
be important. Prospective visitors should decide whether they 
would prefer a megametropolis such as Tokyo or Osaka, or less 
crowded cities such as Kanazawa and Beppu. The availability 
of English language schools for children, and of local commu- 
nities and support networks for spouses, are also important 
factors to consider for long-term stays in Japan. 

Visitors to Japan are often surprised by the safety of its 
cities. Personal safety and peace of mind are some of the at- 
tractions of studying or working in Japan. Another is cost, with 
tuition fees at Japan’s national universities considerably lower 
than universities in the United States and certain European 
countries. This is an important factor when choosing universi- 
ties, given the increasing number of students needing loans 
to pay for their university education in the United States and 
United Kingdom. 

Many opportunities exist for short-term scholarly research 
in Japan, as many visiting academics will testify. For example, a 
three-to-nine month sabbatical can enable young researchers 
to form connections for future collaborations. However, there 
are challenges filling full-time, tenured positions at Japanese 
universities because of severe gaps in matching the applicants’ 
expectations with the realities of low salaries, heavy teaching 
obligations, and insufficient startup funding for setting up 
research labs. Furthermore, even if such issues are resolved, 
people with families may find it difficult to balance their careers 
with their children’s education because of the lack of reason- 
ably priced international schools in Japan. 


Surviving Japan’s educational reforms 

Japan's academia is going through changes on par with those 
of the Meiji Restoration in the late 19th century. The Meiji move- 
ment laid the foundations for the current educational system, 
devised by Japanese scholars for Japanese students. But a 
Japanese-only study environment and curriculum has produced 
students who are unable to function effectively in English—the 
de facto global language of scholarly communication. Mastery 
of English is particularly important for Japanese industry, which 
is becoming increasingly internationalized and needs globally 
minded employees to run its overseas operations. 

Importantly, the outcome of these two high-profile MEXT 
programs will have significant implications for the future of 
education and research policy in Japan. The next 5 to 10 years 
will likely see both mergers and closures of universities large 
and small that are unable to attract enough students and fund- 
ing as government subsidies are reduced. 

Japan’s university administrators face unprecedented and 
multifaceted problems as they struggle to cope with increased 
international competition for students and research staff. 
Finding solutions to these issues will require dynamic, diverse, 
and global approaches. The academic institutes that survive 
this wave of educational reform in Japan will be those that 
implement bold initiatives to create borderless, multicultural, 
multilingual, and globally connected campuses. 


Adarsh Sandhu is a freelance science writer based in Tokyo. 
Waseda University goes global 


Waseda University is a top, private academic institution 
in Japan in terms of prestige, scholarly achievements, 
and financial capability. The university is rapidly implementing 
innovative programs to create a worldwide academic network 
for an open, diverse, and dynamic campus. 

“We have set two challenging and ambitious goals for 
the next 10 years under the Waseda Goes Global (W2G) 
plan,” says Vice President Shuji Hashimoto. “Training 100,000 
students for global leadership through research and education 
programs with international academic partners, and being 
ranked among the world’s top 100 universities in 18 different 
areas of research.” 

Waseda’s goals are supported by funding from the 
government's Top Global University Project. Waseda was 
one of only 13 universities selected by Japan's Ministry of 
Education, Culture, Sports, Science and Technology (MEXT) 
in the highly competitive Type A group when funding was 
launched in September 2014. The aim of the project is to help 
selected universities achieve reforms that will enable them to 
compete globally. 

W2G is a ten-year plan, with approximately ¥540 million 
(US$4.7 million) allocated for the first full fiscal year, 2015. 
The university has demonstrated its commitment to achieving 
dramatic results by dedicating ¥190 million (US$1.7 million) 
of its own funds on top of the ¥350 million (US$3 million) 
government grant. 

The funding is being used to support six education and 
research units prioritized by Waseda to implement the goals 
of the W2G plan. The six units are Global Japanese Studies; 
Empirical Analyses of Political Economy; Health Promotion: 
The Joy of Sports and Exercise; Frontier of Embodiment Informatics: 
Information and Communication Technology (ICT) and 
Robotics; Energy and Nanomaterials; and Multiscale Analysis, 
Modeling, and Simulation. Specific goals of the W2G plan 
include doubling the number of visiting researchers from 
810 to 1,600 and increasing the international faculty from 
760 to 1,380. 

“We are also making major changes to our personnel 
recruitment policies to achieve the aims of the W2G plan,” 
explains Hashimoto. “We are actively recruiting faculty from 
overseas, increasing the number of tenure-track positions 
and joint appointments.” 

Furthermore, Waseda is introducing an academic calen- 
dar based on quarters to synchronize with those of overseas 
universities. The university is also creating joint appointment 
chairs and cotutoring for double degree programs, as well 
as giving full support to visiting scholars and students study- 
ing under the Top Global University Project. The aim of these 
changes is to support students while they are at Waseda, and 
to ensure they have priority access to accommodation on or 
close to campus. 

Hashimoto is confident that Waseda will succeed in achiev- 
ing its goals and will act as a model for other universities 
with similar ambitions. "Waseda is truly going global. We 
welcome highly motivated students and researchers from 
around the world to join us on this unique journey." 


Waseda University at a glance 


• Founded in 1882 by Shigenobu Okuma, who twice served as 
prime minister of Japan, Waseda University is one of the largest and 
most influential private universities in Japan, with more than 50,000 
students and 5,500 academic staff. 

• Waseda is Japan's foremost international university with more 
than 5,000 international students and more than 700 overseas partner 
institutions in 81 countries. 

• International students can choose from all-English degree 
programs in 6 undergraduate and 12 graduate schools, and learn 
over 25 foreign languages; the curriculum for Japanese language 
training is Japan’s most extensive. 

• Waseda’s main campus is located in central Tokyo, with 
convenient access to shopping areas, international schools, 
embassies, and offices of major corporations. 

• The university has a global network of approximately 600,000 
alumni, including novelist Haruki Murakami, UNIQLO founder and 
CEO Tadashi Yanai, Olympic gold medalist Shizuka Arakawa, and 
seven Japanese prime ministers. 

• Waseda University was ranked 33rd in the world and 1st in 
Japan in the Quacquarelli Symonds graduate employability 
rankings published in November 2015. 


FURTHER INFORMATION 


Waseda University 
www.waseda.jp/top/en 


Top Global University Project 
www.waseda.jp/inst/sgu/en 


Energy and 
nanomaterials 


Searching for innovative means 
to produce, store, and use energy 


The Unit for Energy and Nanomaterials is one of the 
flagship research hubs established by Waseda University as 
part of the Top Global University (TGU) project funded by 
Japan's Ministry of Education, Culture, Sports, Science and 
Technology (MEXT). “We want to contribute to global efforts 
to mitigate the daunting problems related to the produc- 
tion, storage, and efficient use of energy,” explains Hiroyuki 
Nishide, former dean of the Graduate School of Advanced 
Science and Engineering, and coordinator of the unit. “We 
have launched several innovative international initiatives 
to facilitate this goal, including hiring academic staff and 
researchers from overseas. We welcome researchers and 
students from overseas to join us.” 


Unit for Energy and Nanomaterials 

Research and education within the Unit for Energy and 
Nanomaterials are driven by international collaborations, 
with Waseda funding joint appointments of overseas staff 
as well as student exchanges as part of a joint Ph.D. degree 
program between a group of partner universities. In 2016, 
Waseda University will hire full professors from universities 
in the U.S. and Australia. The new staff will have one-to- 
three-year fixed contracts and their duties will include 
teaching and research. Published work in peer-reviewed 
journals will carry the names of both universities. Salaries 
are competitive and the staff from the partner universities 
stay at Waseda for a minimum of three months per aca- 
demic year. 


ADVERTISEMENT 


Regular interuniversity exchanges and close supervision of 
students are important for the program’s success. Students are 
carefully matched with supervisors from the partner universi- 
ties and are required to take designated course credits at the 
partner university and to conduct research there for more than 
three months. Students successfully completing the program 
will receive a certificate under the name of Waseda and the 
partner university. 

“Compared with students who undertake doctoral courses 
at only a single institution, these students will benefit by inter- 
acting with top researchers at two universities located in differ- 
ent countries,” says Nishide. 


Innovative world-class research on batteries 
Hiroyuki Nishide is internationally renowned for the development
of “radical polymer” batteries that are semitransparent,
flexible to the extent of being foldable, and can be charged 
in less than 30 seconds. “We use so-called p- and n-type re- 
dox couples of radical polymers in our flexible batteries,” says 
Nishide. “For example, we tune radical polymers with a wide 
selection of molecular structures, enabling these materials to 
exhibit the properties of both cathode and anode materials. Pro- 
totypes show very promising properties such as rapid charging 
and a long shelf life.” 

Specifically, 0.5-mm-thick radical polymer batteries with areas 
of 3 cm² exhibit a battery capacity of 6 milliamp hours and a
power density of 5 kilowatts/liter. These groundbreaking batter- 
ies are expected to find applications in consumer products such 
as smart wrist watches. 


Research institutions collaborating with 
the Unit for Energy and Nanomaterials 


The Smart Energy System 
Innovation Center 

This center for battery innovation develops high-capacity
secondary batteries, including lithium sulfide and metal-air
types. Other work includes nondestructive monitoring of
batteries and the development of silicon anode batteries.


Energy Management System 

Shinjuku R & D Center 

This center focuses on research into next-generation energy
management system responses and the development of
next-generation voltage regulation technology in power networks,
including power generated by renewable energy.


FURTHER INFORMATION 


Unit for Energy and Nanomaterials 
www.tgu-enm.sci.waseda.ac.jp 


Graduate Program in Science and Engineering 
www.leading-en.sci.waseda.ac.jp/en 




Frontiers of embodiment 
informatics: 


Combining communications and 
engineering for human symbiotic 
robots 


Relationships between people are fraught with difficulties. So
imagine the challenges researchers face in forming mutually beneficial 
relationships between humans and robots. Shigeki Sugano has spent a 
lifetime of research doing just that: formulating ways of enabling robots 
and humans to interact with and understand each other. 

“The fundamental question is whether humans and robots can coexist, 
and work together closely to achieve goals,” says Sugano. “The answer
is important for the development of robots to support human activities, 
particularly in aging societies such as in Japan.” 

Sugano and colleagues have defined safety, dependability, and dexterity
as critical factors for humans and robots to be able to coexist. Extensive
research based on instilling these traits in robots led to the development
of TWENDY-ONE, a highly dexterous and multifunctional human
symbiotic robot able to adapt to human movement and manipulate 
objects, such as putting bread into a toaster. 

“The Graduate Program for Embodiment Informatics is funded by 
Japan's Ministry of Education, Culture, Sports, Science and Technology 
[MEXT] as one of its Leading Graduate School Programs. It aims to nurture 
students to meet the formidable challenges for realizing the perfect human
symbiotic robot,” explains Sugano. “We welcome highly motivated
students and researchers from
overseas to join us for firsthand
insights into Japan's monozukuri
or manufacturing technology.”

The MEXT funding for the Grad- 
uate Program for Embodiment 
Informatics reflects the long and 
distinguished history of robotics 
research at Waseda University. Su- 
gano is following in the footsteps 
of his mentors Ichiro Kato and Kat- 
suhiko Shirai, who combined their 
talents in mechanical and electri- 
cal engineering to develop WABOT-1, the world’s first humanoid robot, 
in 1973. “It would not be an exaggeration to say that we are standing on
the foundations laid by Professors Kato and Shirai,” says Sugano. 

In the program, students take on projects to build human symbiotic ro- 
bots by integrating research on mechanical technology and information 
communications technology. The Kobo Workshop—a shared open space 
where students from different backgrounds meet and interact freely—is 
one of the unique facilities of the program. 

Future research themes include the WAMOEBA Project (Waseda- 
Amoeba, Waseda Artificial Mind On Emotion BAse), a methodology for 
instilling emotion and self-preservation instincts into robots. 




FURTHER INFORMATION 


Shigeki Sugano 

Department of Modern Mechanical Engineering, School of Creative 
Science and Engineering, and coordinator of the Waseda Graduate 
Program for Embodiment Informatics. 


Graduate Program for Embodiment Informatics 
www.leading-sn.waseda.ac.jp/en/ 


Fusing science and 
engineering in real- 
world modeling of 
fluid dynamics 


Aircraft, oil tankers, and nuclear power stations are critical elements
of modern society. However, the average citizen is probably
unaware that the design, manufacture, and implementation of these im- 
portant technologies require a deep knowledge of fluid mechanics—how 
mixtures of gases, liquids, and solids move and interact. Furthermore, in 
spite of recent advances in computing and mathematical modeling, our 
understanding of the intricacies of fluid mechanics is still incomplete. 
Against this background, enhancing our knowledge of mathematical 
fluid mechanics is the main goal of the international doctoral program at 
the Research Institute of Nonlinear Partial Differential Equations (PDEs) at 
Waseda University. 

“Our international doctoral program on mathematical fluid mechanics 
is focused on the analysis of real-world fluidics such as bubble forma- 
tion in the cooling systems of nuclear reactors,” says Yoshihiro Shibata, 
Department of Mathematics, Faculty of Science and Engineering and 
Research Institute of Nonlinear PDEs. “The program is part of Waseda 
University’s Top Global University (TGU) project.” 

Examples of research in this program are mathematical analysis of 
multiscale complex phenomena, biofluid mechanics for production of 
biofuels using Euglena microorganisms, solutions to the free boundary 
problem of Navier-Stokes equations, and modeling of cavitation
phenomena. Notably, the
decommissioning of Japan's 
nuclear reactor in Fukushima has 
focused attention on multiscale 
complex phenomena, including 
how to understand complicated 
interactions between molten 
fuels and the other structures of 
the reactor. 

The doctoral program is evolv- 
ing into a credit-based system, 
enabling students to transfer cred- 
its to partner institutes in Europe 
and North America. Students will be supervised by faculty members at 
both Waseda and their overseas partners. 

“Currently we are creating long-term frameworks with Darmstadt Uni- 
versity, Germany, the University of Pittsburgh, U.S., and the Universities 
of Pisa and Bari in Italy,” explains Shibata. “We want to nurture young 
minds of all nationalities capable of accurate mathematical modeling and 
simulations based on physics, engineering, and numerical analysis. We 
welcome people from overseas to join us in making physics and math- 
ematics a driving force behind building the infrastructure of our society.” 


FURTHER INFORMATION 


Yoshihiro Shibata 
Department of Mathematics, Faculty of Science and Engineering 
and Research Institute of Nonlinear Partial Differential Equations 


Yoshihiro Shibata’s website 
www.fluid.sci.waseda.ac.jp/shibata/


Mathematics and Physics Unit, TGU project, Waseda University 
www.sgu-mathphys.sci.waseda.ac.jp/en/index.html


SPE Plates

The Strata-X µElution 96-Well Plates enable solid phase extraction (SPE)
from samples as small as 10 µL, along with low elution volumes. These
cost-effective 96-well plates conserve precious samples and reduce time
requirements compared to traditional SPE plates, which require 200-µL
elution volumes and a dry-down step to produce sufficient concentration.
This dry-down step is time-consuming and can reduce analyte recovery of
peptides and thermolabile compounds. The new Strata-X µElution 96-well
plates are ideal for work with biological samples in pharmaceutical,
clinical research, and forensic laboratories. Strata-X µElution 96-well
plates follow the same procedure as traditional 96-well SPE plates;
however, elution volumes are as low as 25 µL, eliminating the need to
perform a dry-down step. Strata-X µElution 96-well plates can be used
with a vacuum manifold or positive-pressure system and can be automated
with a robotic liquid handler for further time and labor savings.
Phenomenex 

For info: 310-212-0555 
www.phenomenex.com 


Confocal Microscope 

The HyVolution true confocal super- 
resolution technology for the Leica TCS 
SP8 confocal microscope platform al- 
lows researchers to resolve structures 
down to 140 nm with multiple colors, 
high imaging speed, and high signal-to- 
noise ratio. HyVolution can be ordered 
with every new Leica TCS SP8 and is 
also available as an upgrade to existing 
Leica TCS SP8 platforms. HyVolution is 
based on a combination of the unique Leica HyD hybrid detector; a
workflow-oriented software wizard; best-in-class Huygens deconvolution
software embedded via a direct interface in the Leica Application
Suite X (LAS X) control software; and CUDA graphics processing
unit-accelerated computing. Researchers benefit from
HyVolution by imaging fast dynamic processes and capturing mul- 
tiple colors simultaneously with the TCS SP8’s spectral detection 
system, and can still achieve 140-nm resolution. 


Leica Microsystems 
For info: 800-248-0123 
www.leica-microsystems.com 


LIFE SCIENCE TECHNOLOGIES 


Thermal Cycler 

The new qTOWER³ real-time polymerase chain reaction (PCR) thermal cycler
features a patented fiber-optic shuttle system with a unique light source
powered by four high-performance light-emitting diodes (LEDs), which
guarantees ideal excitation and detection of known fluorescent dyes up to
the dark red. In addition, the system’s highly sensitive detection module
can be equipped with up to six different color modules. Color modules are
upgradeable, allowing the system to benefit from future developments.
Silver block technology is at the heart of the qTOWER³, offering
outstanding homogeneity of only ±0.1°C over the entire 96-well block.
Optionally, the linear gradient function is the optimal tool to easily
adjust the instrument to different assays. The qTOWER³ is available either
as a stand-alone device with an integrated 10-in. tablet or as a PC-based
system. The software comes with a broad spectrum of optimized analysis
algorithms, including absolute and relative quantification, melt point
analysis, delta-delta cycle threshold (Ct) method, PCR efficiency, allelic
discrimination, endpoint, and protein determination.

Analytik Jena 

For info: +49-(0)-36-41-77-70 
www.analytik-jena.de 


Hiden Isochema 


NEW PRODUCTS 


Cancer Stem Cell
Identification Kit

The new AldeRed ALDH Detection Kit 
is designed for identifying and isolating 
cancer stem cells. The AldeRed reagent 
is used to label cancer stem cells
with a red fluorescent dye, making it
possible to distinguish cancer stem 
cells in live cell populations where 
specific identification was previously 
impossible. Aldehyde dehydrogenase 
(ALDH), a cancer stem cell marker 
enzyme, causes the AldeRed reagent 
to fluoresce in the far-red spectrum, 
allowing the cancer cells to be 
identified and isolated with concurrent 
use of green fluorescent cell lines, 
transgenic animals, and reporter 
assays. Previous ALDH reporters 
exhibited green fluorescence, which 
made it difficult to identify positive 
cells in an otherwise green fluorescent 
background. 

EMD Millipore 

For info: 800-645-5476 
www.emdmillipore.com/stemcells 


Automated Materials Analyzer 
The ABR Automated Breakthrough Analyzer is designed to meet the needs
of researchers wishing to characterize the gas separation performance of
novel materials, such as metal-organic frameworks (MOFs), zeolitic
imidazolate frameworks (ZIFs), and covalent organic frameworks (COFs),
without the time or expense of synthesizing larger quantities of material.
Breakthrough curve measurement allows high-throughput screening and
testing of adsorbents for a wide range of separation processes, including
CO₂ capture and storage, the purification and recovery of noble gases,
natural gas and biogas upgrading, toxic gas removal, and air separation.
The ABR is a dedicated breakthrough analyzer, fully automated and supplied
with an integrated close-coupled quadrupole mass spectrometer. Different
configurations are available to suit research-scale samples, with bed
volumes from 2 cc to 20 cc. Up to six gas inlets are available as well as
a dedicated purge stream. Flow rates are selected to suit the specific
applications, and the ABR includes an ultralow dead-volume switching valve.


Hiden Isochema
For info: +44-(0)-1925-244678


www.hidenisochema.com


Don't miss the debut of
Science Immunology. 


Image: Eraxion / iStockPhoto 


NOW ACCEPTING PAPERS 


Science is expanding its reach into immunology, now
offering the newest online-only, weekly journal in the
Science family of publications. Science Immunology will
provide original, peer-reviewed research articles that
report critical advances in all areas of immunological
research, including studies that provide insight into the
human immune response in health and disease.

Be a part of the Science Immunology debut
issue publishing Summer 2016!


Science Immunology


AAAS


Submit your manuscript today at 


ScienceImmunology.org.


NEW ENGLAND BioLabs


Time for 
change. 


Introducing Monarch™
Nucleic Acid Purification Kits 


It’s time to transform your DNA purification experience. 
NEB’s Monarch Nucleic Acid Purification Kits are optimized 
for maximum performance and minimal environmental 
impact. With an innovative column design, buffer retention 
is prevented, eliminating risk of carryover contamination and 
enabling elution in smaller volumes. The result — highly pure 
DNA for your downstream applications. 


Make the change and migrate to Monarch today. 


Optimized design of Monarch Miniprep Columns 


Labeling tab and frosted surface provide convenient writing spaces

Made with less plastic for reduced environmental impact

Unique, tapered design eliminates buffer carryover and allows for
elution in smaller volumes

Binding capacity up to 20 µg

Column tip is compatible with vacuum manifolds


Request your free sample at NEBMonarch.com


NEW ENGLAND BIOLABS® and NEB® are registered trademarks of New England Biolabs, Inc.
MONARCH™ is a trademark of New England Biolabs, Inc.


Advertising


For full advertising details, go to 
ScienceCareers.org and click

our representatives.


Tracy Holmes 

Worldwide Associate Director
Phone: +44 (0) 1223 326525
Kelly Grace
that they feel may be discriminatory or offensive.


POSITIONS OPEN
MOTE MARINE LABORATORY
& AQUARIUM 


2017-18 POSTDOCTORAL RESEARCH 
FELLOWSHIP 


Mote Marine Laboratory announces a new two-year 
position beginning January 2017, to support an in- 
dependent investigator based at Mote’s new facility 
(~26,000 square feet of research and science-education
infrastructure) in the Florida Keys. Applications are in- 
vited from recent (since January 1, 2014) Ph.D. graduates, 
including those with firm expectation of graduation by 
December 2016. However, at time of appointment, a 
doctoral degree must have been awarded. Proposals for 
any field of marine research will be considered. Com- 
petitive applications will focus on research programs that 
are relevant to conservation and the sustainable uses of 
marine biodiversity, healthy habitats, and natural resources; 
will bring or propose new multi-investigator/institutional 
collaborations to Mote; and will be cognizant of global
issues. For complete Fellowship information and application
requirements, see website:
www.mote.org/about-us/employment-opportunities. The deadline
for applications is August 31, 2016, and finalists will be
announced in October.


YALE SCHOOL OF MEDICINE

A position is available to join a multidisciplinary 
research team studying spinal cord injury and related 
chronic disorders (see Tan et al., J Neurosci., 28:13173-13183, 2008;
Tan, Prog Mol Biol Transl Sci., 131:385-408, 2015; Bandaru et al.,
J Neurophysiol., 113:1598-1615, 2015). Ph.D. and/or M.D. degree, and experience/
publications in neurophysiology with animal models 
of pain or motor dysfunction are essential. Experience 
with immunohistochemistry and rodent surgical procedures,
including models of SCI and in vivo electrophysiology,
is strongly desired. Superb opportunity
to work as part of a rapidly moving, collaborative team, 
applying state-of-the-art methodology to investigate 
SCI, glial scarring, neuropathic pain, and spasticity. Send 
statement of interest, Curriculum Vitae, and three letters 
of reference to: Stephen G. Waxman, M.D., Ph.D. or 
Andrew M. Tan, Ph.D., Neuroscience Research Center, 
Building 34, Veterans Administration Connecticut
Medical Center, (127A), 950 Campbell Avenue, West 
Haven, CT 06516; e-mail: stephen.waxman@yale.edu;
andrew.tan@yale.edu


POSTDOCTORAL POSITION 
MAIZE CHROMOSOMES 


A POSTDOCTORAL POSITION is available in 
the Division of Biological Sciences at the University of 
Missouri, Columbia to study the epigenetic aspects of 
centromere function in maize in terms of the molecular 
parameters that condition activity and inactivity, the se- 
quence analysis of the supernumerary B chromosome 
in general and its centromere in particular, molecular 
aspects of B chromosome evolution, and exploration 
of the molecular basis of the drive mechanism of the 
B chromosome using sequence information. The B chro- 
mosome centromere has the advantage for such studies 
in that a specific repeat is present in and around this 
centromere that allows it to be distinguished from all
others in the karyotype. The project will also be in- 
volved with the development and utilization of whole 
chromosome exonic paints for each maize chromosome. 
These paints will be used to examine chromosomal 
structure, behavior and evolution. The University of 
Missouri-Columbia has a vibrant plant biology group 
providing an excellent training environment. The Uni- 
versity of Missouri is an Affirmative Action/Equal Op- 
portunity Employer. Interested applicants should email 
their curriculum vitae including contact information of 
three references to James Birchler (e-mail:
BirchlerJ@Missouri.edu).


POSITIONS OPEN 


MASTER OF BIOMEDICAL 
INFORMATICS, 
HARVARD MEDICAL SCHOOL 


Program Description: The program provides
the intellectual framework for clinicians and bio- 
medical scientists in the systematic and sound use 
of quantitative methods to increase agility with 
such methods in their respective domains. The 
program includes an intensive, hands-on quanti- 
tative boot camp, a range of foundational courses, 
and courses in emerging areas such as precision 
medicine, data science, and data visualization. All 
students are expected to complete a capstone re- 
search project and to participate in a longitudinal 
seminar series. 

Who is this Program for?

(1) Postdoctoral students who recognize the relevance of
informatics to their research

(2) MDs who are interested in qualifying for the
subspecialty in clinical informatics

(3) Medical students who would like to take a research year
during their training to explore the importance of
informatics in the practice of medicine

Contact information: To learn more about the
program, please visit our website and email us
with any questions through our ‘Contact Us’ page:
https://informaticstraining.hms.harvard.edu/


A POSTDOC RESEARCH ASSOCIATE Position is
open at Wright State University (WSU) to study Cell Cycle
and Checkpoint Signaling (website:
http://people.wright.edu/yong-jie.xu). Candidates with experience
in molecular and cell biology, genetics, or protein biochemistry
are encouraged to apply online at website
http://jobs.wright.edu/postings/10105.

Wright State University: Affirmative Action/Equal Opportunity
Employer/Male/Female/Veteran/Disability.


Search results:
Careers in 
high tech 


High technology permeates every corner of every enter- 
prise, from global computing corporations, to social media 
and search establishments, to retail giants. Not surprisingly, 
these industries offer attractive playgrounds for Ph.D.-level 
scientists and engineers. By Alaina G. Levine 


Almost every moment of our day is somehow touched
by high technology. Whether you are searching for
an old friend or buying coffee on the Internet, billions
of lines of code, petabytes of data, and a potentially infinite
amount of brain power make it possible. And behind every
invention are scientists and engineers. As more industries
are influenced by big data and computer-based systems,
the need for talented Ph.D.-level science, technology,
engineering, and mathematics (STEM) professionals to
contribute to these arenas has grown considerably.

High tech jobs are exciting and diverse: The problems
they address are interesting and intense, involve
multifunctional (and in many cases, multinational) teams, and offer
the chance to make a difference that is felt by customers
the world over. As Nicholas Clinton, a developer advocate
at Google with a Ph.D. in environmental science, policy,
and management, says: “It’s great to feel like I’m part of
something impactful, with real power to effect global-scale
change.”

As a member of the developer relations team for Google
Earth Engine, a platform for Earth science analysis, Clinton
strives to ensure that external developers are able to utilize
the instrument effectively. He collaborates with the Earth
Engine engineering team to help them identify user needs
and to improve the platform. “I conduct a lot of trainings,
give a lot of lectures, and create documentation to enable
users to do incredible things,” he says. “I ensure that
researchers can use Earth Engine to perform high-impact,
data-driven science.” 

Sun Mi Chung, a Ph.D. astrophysicist and principal data 
scientist at AOL, also appreciates how rapidly her work af- 
fects the public. In her job, she applies machine learning 
techniques to optimize real-time bidding for advertisements 
on the AOL platform. “We have to think deeply about what 
makes sense in terms of the algorithms we use and whether 
we can put it into production quickly,” she says. Chandra 
Narayanan’s doctorate is in oceanography, and as director 
of data science for Facebook, he has engaged with almost 
every product in the company. With a background in creat- 
ing numerical models for Earth systems, he was working 
for the National Weather Service when he heard that a new 
group was forming at PayPal that was eventually to become 
one of the first data science groups in industry. Narayanan 
came on board at PayPal in 2007, where his responsibilities 
included risk analytics and fraud identification. 

His entry into Facebook in 2010 was facilitated by a for- 
mer colleague. He initially joined the social network in its 
risk management practice, but every few months, “I took 
on a new portfolio,” says Narayanan. His accomplishments
include building from scratch the teams that focus on Insta- 
gram, games, risk, payments, and advertisements. But he is 
most proud of his ability “to be able to charter a new course 
for what data science means in industry,” he says. “Many 
companies are using Facebook as their model to form data 
science teams.” 


Investigating the diversity of destinations 

Not surprisingly, data science careers are particularly 
prominent in the high tech space. David Evans is a com- 
putational linguist with a Ph.D. in computer science from 
Columbia University. He is also passionate about Japanese 
language and culture and had studied it since he was an 
undergraduate. An internship at IBM Japan while in grad 
school solidified his interest to work in that country and 
combine his two loves. While pursuing a postdoc at the 
National Institute of Informatics in Tokyo, Japan, an Ama- 
zon recruiter contacted him about an opening related to 
information retrieval and searching. The company cont.> 
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“| address grand 


develop solutions 
that affect 


people’s lives.” 


needed someone who had both data analytic skills and a 
prowess in Japanese linguistics. “Because Japanese and 
English are so different, there are very different ways of 
searching for information in those languages,” says Evans. 
Given his research in information retrieval and the fact that 
he was bilingual, “it made sense for me to go to Amazon,” 
he says, and today he is a senior search engineer working 
for A9, a wholly owned subsidiary of Amazon Japan.

With data being utilized in increasingly new and creative 
ways, the diversity of career paths in high tech companies 
has increased, especially in multinational firms like IBM. 
Kristen Beck and Temitope A. Ogunyoku are both IBM 
employees and scientists who hold a Ph.D.—Beck’s doctor- 
ate is in biochemistry, molecular, cellular, and developmen- 
tal biology, and Ogunyoku’s is in civil and environmental 
engineering. Their jobs and career paths are very different 
and are on opposite ends of the planet. And neither of them 
does what one might expect at Big Blue.

Beck, who is based at IBM Almaden Research Center 
in Silicon Valley, works on bioinformatics problems in as- 
sociation with the University of California, Davis and Mars, 
Incorporated. She is examining ways in which analytics 
can be applied to food safety on various fronts, including 
pathogen detection, antibiotic resistance, and food fraud 
or mislabeling. She leverages her biology background to 
implement solutions based on life science tools, such as 
next-generation sequencing. 

Ogunyoku is a research scientist with IBM Research— 
Africa in Nairobi, Kenya, one of only 12 global research labs 
in IBM’s portfolio, where “I address grand challenges in 
Africa and develop solutions that affect people’s lives,” she 
says. Her focus is on creatively utilizing analytics to scruti- 
nize complex interconnected datasets and deploy solutions 
in fields such as public safety and waste management. For 
example, her team monitored social media in Kenya for 
data about crime, because people use it as a platform to 
report public safety concerns. “We used algorithms and 
natural language processing systems to detect and deter- 
mine the credibility of these incidences,” she explains, add- 
ing that the goal of this research was to develop a product 
that can be used by security companies to alert their clients 
of criminal activity. 


Searching for an “in” 

At Facebook, there are multiple entry points for Ph.D.-level
scientists and engineers interested in joining the company.
Your doctorate gives you access to jobs in product management,
engineering, design, analytics, user experience
research and even marketing and sales, says Narayanan.
The key to employment? “Love the mission, be quantita- 
tive, be interested in solving hard problems and building 
awesome products,” he stresses. As the head of recruit- 
ment for analytics, he looks for candidates who display a 
“ton of curiosity, drive and leadership, have a highly analytic 
nature, enjoy a fast-paced environment,” and of course 
have superior coding skills. Interestingly, new employees in 
Facebook’s analytics department come in through a central 
pool, and after a five-week boot camp and orientation, can 
pick which group they want to work with. 

Similarly, at Amazon, Ph.D. scientists are recruited for 
their technical expertise, and “you get to come in and look 
for a way to apply your work,” says Evans. “Your career is 
up to you. Amazon matches capabilities to interests and 
interests to projects.” For his team, he looks for profession- 
als with a background in machine learning, computational 
linguistics, and information retrieval. But the key to getting 
a job, especially in software development and analytics, is 
to clarify “how what you are doing now can be applied to 
products [and systems] at the company,” he adds. That’s 
essentially how Clinton landed a position at Google. “The 
more you can demonstrate how Google can leverage your 
research and development work to achieve amazing, broad- 
ly applicable results, the better [your chances for getting a 
job],” adds Clinton. 

In smaller organizations and startups, the hiring process 
tends to focus on immediate needs, as dictated by the 
business plan. When Kamal Jain, CEO and founder of 
Faira, a technology company for real estate, recruits, he
looks for people with skills
that match the task to be
done. Radu Rusu, CEO
and cofounder of Fyusion,
a startup looking to reinvent
the use of 3D imaging for
consumer applications, pores
over publications to find
“research results that match
our roadmap,” he says. But
he also keeps an eye out for
scientists who possess honesty,
humility, and flexibility,
a marker of their potential to
prosper in his organization.




Navigating a new culture 

As you transition into high tech, it is important to recog- 
nize the variances in culture among these types of com- 
panies, as compared to other sectors. One of the features 
of Google’s culture that Clinton immediately noticed is its 
emphasis on teams, an approach that differs from
what is usually found in universities. “The team environment
is a big change from academia, where you work in collabo- 
rations, but a lot of time is spent on independent study,” 
he says. At Google, “you need very tight teamwork, timing, 
communication, and camaraderie to compete successfully.” 
At FICO, the financial services company, teams are always 
interdisciplinary, says Scott Zoldi, chief analytics officer, 
who holds a doctorate in physics. “You have to


The National Academies of
SCIENCES * ENGINEERING * MEDICINE 


Graduate, Postdoctoral, and Senior Research Awards 
offered for research at 
US federal laboratories and affiliated institutions 


Opportunities for research in all areas of science and
engineering

• Awards for independent research at over 100 participating
laboratory locations
• 12-month awards renewable for up to 3 years
• Annual stipend $45,000 to $80,000 for recent PhD recipients
and higher for additional experience. Graduate entry level
stipend is $30,000 and higher for additional experience
• Relocation, professional travel, and health insurance
• Annual application submission deadlines February 1, May 1,
August 1, November 1
• Open to international applicants


Detailed program information, including instructions on how to apply 
online can be found on the NRC website at: 


www.nationalacademies.org/rap 


Applicants must contact Adviser(s) at the lab(s) prior to application deadline to 
discuss research interests and funding opportunities. 
Questions should be directed to the: 
NRC Research Associateship Programs 
TEL: 202-334-2760; EMAIL: rap@nas.edu 
Qualified applicants will be reviewed without regard to race, religion, color, 
age, sex or national origin. 


The 20 aie 
Fulbright U.S. Scholar 
Competition is open 
Opportunities in over 125 countries for faculty,
administrators, postdocs, professionals, artists,
independent scholars and many others.


For more information on recent program
innovations, including flexible, multi-country 
opportunities, please visit: 


Fulbright


SCHOLAR PROGRAM 




THE UNIVERSITY OF HONG KONG 


Founded in 1911, the University of Hong Kong is committed to 
the highest international standards of excellence in teaching and 
research, and has been at the international forefront of academic 
scholarship for many years. The University has a comprehensive 
range of study programmes and research disciplines spread 
across 10 faculties and over 140 academic departments 
and institutes/centres. There are 28,000 undergraduate and 
postgraduate students who are recruited globally, and more 
than 2,000 members of academic and academic-related staff 
coming from multi-cultural backgrounds, many of whom are 
internationally renowned. 


Post-doctoral Fellowships 


Applications are invited for a number of positions as Post-doctoral 
Fellow (PDF) at the University of Hong Kong. Appointments will 
be made for a period of 2 to 3 years and the appointees must be 
in post on or before February 28, 2017. 


PDF posts are created specifically to bring new impetus and 
vigour to the University’s research enterprise. Positions are 
available from time to time to meet the strategic research needs 
identified by the University. Positions are available in the following 
Faculties/Departments/Schools/Centres/Units: 


• Faculty of Architecture
• Real Estate and Construction
• Civil Engineering
• Computer Science
• Electrical and Electronic Engineering
• Mechanical Engineering
• School of Biomedical Sciences
• Centre for Cancer Research
• Research Centre of Heart, Brain, Hormone and Healthy Aging
• Centre of Influenza Research
• Medicine
• Microbiology
• Orthopaedics and Traumatology
• Pathology
• School of Public Health
• Centre for Reproduction, Development and Growth
• Chemistry
• Physics
• Geography
• Psychology
• The State Key Laboratory for Liver Research


Post-doctoral Fellows 

PDFs are expected to devote full-time to research. Applicants 
should be doctoral degree holders having undertaken original 
research that has contributed to the body of knowledge. A 
highly competitive salary commensurate with qualifications and 
experience will be offered. Annual leave and medical benefits 
will also be available. 


Procedures 

Prospective applicants are invited to visit our webpage at 
http://jobs.hku.hk/ to view the list of the Faculties/Departments/ 
Schools/Centres/Units and their research areas for which PDF 
positions are currently available. Before preparing an application, 
they should contact the Head of the appropriate academic unit, or 
the contact person as specified, to ascertain that their research 
expertise matches the research area for which a vacant PDF 
post is available. 


Applicants must submit a completed University application form, 
which should clearly state which position they are applying for, 
and in which academic discipline. They should also provide 
further information such as details of their research experience, 
publications, research proposals, etc. 


Application forms (341/1111) can be downloaded at 
http://www.hku.hk/apptunit/form-ext.doc and further particulars 
can be obtained at http://jobs.hku.hk/. Closes April 17, 2016. 
The University thanks applicants for their interest, but advises 
that only candidates shortlisted for interviews will be notified of 
the application result. 


The University is an equal opportunities employer and is committed to a Non-Smoking Policy

be a good listener and a good collaborator,” he says. “It’s a
rich environment, where different points of view are not just 
welcome—they’re expected. It’s not going to be one Ph.D. 
scientist solving the problem, but rather a group of people, 
from many different fields, working together....[This] yields 
better results.” 

As Narayanan made his way through PayPal and Facebook,
he was intrigued by how much his scientific skills
easily transferred to the high tech industry. As an oceanographer,
he was used to applying models to understand
processes associated with natural phenomena. At PayPal
and Facebook, he tapped into the same set of abilities.

“It was easy to jump in. In fact it was seamless,” he says.
“Being able to analyze data, recognize patterns, summarize
results, break down problems in the simplest way—these
are the kinds of things I learned prior to joining industry.”

Evans notes that the culture of Amazon encourages employees
to identify ways to improve the company, whether
or not that improvement is related to their job function. “We
have the responsibility. We can take a real ownership of a
problem,” he says. This translates to an ecosystem where
individuals have a large amount of influence and freedom.
One of the main projects he’s worked on had little to do
with search capabilities. Rather, it was a company-wide
effort that he spearheaded on his own, relating to setting
the time of product launches according to local time zones
as opposed to a central clock working off of the Seattle
headquarters. “We had to replumb everything,” he says,
referring to programming the systems to make it easier for
customers to purchase items. It took years of partnering
with teams across the planet, but “it felt surprisingly powerful
to make this change worldwide.”

But Evans also clarifies that Amazon’s philosophy is 
not for everyone. “There is a lot of pressure,” he admits, 
“and it’s important to know your limits to achieve a 
work-life balance.” For someone coming straight from 
academia, adjusting to this fast-paced ecosystem might 
be challenging. 


Honing skill sets to achieve success 

Although there are ample professional opportunities in
the high tech sector, it is critical for candidates to differentiate
themselves from the competition, and certain skill
sets are particularly advantageous to hone. For software
engineering and data science careers, it is vital to understand
databases and algorithms and how to apply them to
solve real-world problems, says Michael Li, whose Ph.D. is
in mathematics. He worked for Intel, Google, Foursquare,
and JPMorgan Chase before launching the Data Incubator,
which trains STEM Ph.D.’s for data science careers. He
emphasizes that technical know-how is what hiring managers
crave. “No one needs just an ‘ideas’ person. They need
someone who can actually get the job done.”

Kristen Beck

“The people who are the most successful, marketable, 
and valuable do their job and also understand the broader 
picture,” says John Heinlein, vice president of marketing 
for ARM, a global designer of semiconductor intellectual 
property, whose Ph.D. is in electrical engineering. “They 
don’t stay in silos. You might never change your role, but 
you'll do a better job if you understand what’s happening to 
the right, left, up, and down in the organization.” 


Advancing your career into new realms 

One aspect of high tech companies that is especially attractive
to Ph.D. scientists and engineers is the flexibility to
determine your own career path. The competition for top
talent is fierce, and firms want to retain the brightest minds.
So they offer their employees wiggle room to design their
own career advancement strategy. For example, it is not
uncommon to find lateral moves encouraged.

At Amazon, “I could move back to the States and still
remain with my current team,” says Evans of his career options
in the future. Adds Ogunyoku: “At IBM, you are able
to reinvent yourself. I can go work for the design team or a
global business unit, [among other choices]. Having a Ph.D.
doesn’t limit me to only research and development.”

With this level of latitude across the high tech arena, job
prospects and career decisions may seem overly complex.
But there is a simple way to determine your next course of
action in crafting a career there: Articulate your own values.
“If you pick opportunities that align with your passion, that
will help you be successful,” says Beck. “You'll feel like you
are part of the larger picture, and it allows you to be an ambassador
for the cause of your choosing.”


Alaina G. Levine is a freelance science writer based in Tucson, AZ. 


DOI: 10.1126/science.opms.r1600162

USC University of
Southern California


Postdoctoral Position 
at the 
University of Southern 
California 


A postdoctoral position is available
immediately in Dr. Yang Chai’s laboratory
at the Center for Craniofacial Molecular
Biology, University of Southern California,
in Los Angeles. We are interested in the
regulation of developmental patterning, 
organogenesis and mesenchymal stem cells. 
Our studies will seek to define molecular 
mechanisms governing both normal and 
abnormal craniofacial development, 
providing scientific rationales for future 
therapeutic strategies to prevent and treat 
craniofacial birth defects. 


The candidate must have a PhD and 
be experienced with molecular and 
developmental biology. For details, please visit
https://dent-web10.usc.edu/ccmb/faculty_detail.asp?RS=1


Send application, resume and three letters 
of recommendation to Dr. Yang Chai (c/o 
pathomps@usc.edu). 


EOE/AA


One Goal... > 


Pfizer Worldwide Research and Development Postdoctoral Program 


At Pfizer, postdocs are trained in the art and science of drug discovery, and work side-by-side 
with scientists who are expert in cutting-edge biology, disease mechanisms, drug delivery 
and mechanisms of action, and the engineering of novel therapeutic proteins, vaccines, 
and nucleic acids. Areas of scientific focus include cardiovascular and metabolic diseases, 
clinical research, comparative medicine, drug safety, biotherapeutics/protein engineering, 
inflammation and immunology, human exploratory biology, medicinal chemistry, 
neuroscience and pain, oncology, pharmacology, and vaccines, among several others. 


We recruit highly motivated Ph.D. recipients with an outstanding record of scientific 
productivity and a passion for ground-breaking, fast-paced research that facilitates
the development of innovative therapies for human diseases. Our program promotes
dissemination of research through publications and participation in scientific meetings, 
provides opportunities for collaboration with leading academic labs and industry consortia, 
and offers exceptional professional development training and networking opportunities. 


To explore our program and research, visit us online at: 
www.pfizercareers.com/student-programs/postdoc 


Pfizer: Working together for a healthier world


www.pfizercareers.com


University of California 
San Francisco 


Postdoctoral Positions 


The Institute for Neurodegenerative Diseases 
(IND) is recruiting postdoctoral research 
fellows to work at the IND with Director 
Stanley B. Prusiner. 


The IND’s mission is to investigate and create 
therapies for neurodegenerative diseases: 
Alzheimer’s, Parkinson’s, multiple system 
atrophy, chronic traumatic encephalopathy, 
PrP prion diseases. Research opportunities 
range from the fields of molecular, cell and 
structural biology and transgenic rodent 
models to mass spectrometry, synthetic 
chemistry and computer modeling. A major 
effort is devoted to an extraordinary drug 
discovery program. 


The IND is housed in the state-of-the-art 
Sandler Neurosciences Center on the UCSF 
Mission Bay campus, greatly facilitating 
interactions with many different scientists 
possessing diverse expertise. 


Applicants should hold or be close to 
completing an M.D. or Ph.D. degree and 
possess superior writing skills. Interested
candidates should apply online at:
http://ind.ucsf.edu/careers/postdoctoral


Immediate Opening for Postdoctoral/Research Associate Level 
Neurophysiologist 


VeroScience is a small but very well developed biotechnology company focusing on neuroendocrine 
therapies for metabolic and immunological diseases. The company developed and owns Cycloset®, 
an FDA-approved therapy for the treatment of type 2 diabetes. VeroScience also has a strong pipeline 
of metabolic disease products and therapies for immunological disorders. The company is a hybrid 
of academic environment mindset and industrial focus within a small and efficient organization. 
The company conducts preclinical and clinical research nation-wide, utilizing strong academic and 
pharmaceutical industry collaborations to advance its development programs. 


VeroScience has an immediate opening for a postdoctoral level neurophysiologist-electrophysiologist, 
with demonstrated expertise in the use of in vivo electrophysiological techniques to study 
neuronal activities within the central nervous system, including the use of direct wire micro- 
electrode recordings. The successful candidate will have expertise in studies with both in vivo 
and in vitro models systems to investigate synaptic signal transduction activities between neurons 
in the central nervous system. A working knowledge of hypothalamic functions in the regulation 
of metabolism, particularly in insulin resistant states with a working knowledge of fuel sensing 
mechanisms within the central nervous system and hypothalamus would be a positive attribute for 
this position, though it is not required. Expertise in diverse molecular and cellular techniques to 
investigate neuronal signal responses to various fuel and neuro-modulatory exposures would be 
beneficial. Such activities will be studied in the context of the development and treatment of various 
metabolic disease states (diabetes, metabolic syndrome) and the impact of novel neuroendocrine 
therapies upon these disease states. The candidate will be part of an interdisciplinary team of scientists 
including neuroscientists, molecular biologists, chemists, zoologists, metabolic physiologists, 
chronobiologists, and endocrinologists in this metabolic disease therapy development program. 
Also, the group will interact with external collaborators from academia and industry. However, 
the candidate must be able to work independently and generate reports and publications from such 
investigations. Importantly, the successful candidate must have an excellent command of the 
English language and a demonstrated ability to publish well written articles in peer reviewed 
journals in this research area. 


VeroScience offers competitive salaries and benefits as well as a very unique and stimulating working 
environment that allows one’s efforts and achievements to be quickly applied to real world health 
problems. Please send CV, names and contact information of three references, and a 1-page summary
of scientific interests to Anthony_Cincotta@VeroScience.com.


The Chinese University of Hong Kong 


Applications are invited for:- 


Department of Microbiology 

The Department has a wide range of research facilities and access to a large comprehensive teaching hospital. The 
establishment provides a good environment for basic as well as clinical research and facilitates collaboration with 
other disciplines. Further information about the Department is available at http://www.cuhk.edu.hk/med/mic/.


(1) Assistant Professor (Non-clinical) 

(Ref. 1516/193(665)/2) (Closing date: April 18, 2016) 

Applicants should have (i) a PhD or equivalent; (ii) a strong research track record in the field of microbiology; and 
(iii) commitment to undergraduate teaching and postgraduate student supervision. Experience in gut microbiome, 
microbiota and faecal transplant research will be an advantage. 

The appointee will (a) undertake teaching and related educational activities for undergraduate and postgraduate 
students; (b) supervise MPhil and PhD students; (c) apply for competitive research grants and related funding; and 
(d) conduct high-standard research projects independently and in collaboration with other parties. 

Appointment will normally be made on contract basis for three years initially commencing as soon as possible, 
which, subject to mutual agreement, may lead to longer-term appointment or substantiation later. 


(2) Research Assistant Professor 

(Ref. 1516/194(665)/2) (Closing date: April 18, 2016) 

Applicants should have (i) a PhD or equivalent; and (ii) a strong research track record in the field of microbiology. 
Experience in gut microbiome, microbiota and faecal transplant research will be an advantage. 

The appointee will (a) apply for competitive research grants and related funding; and (b) conduct high-standard
research projects independently and in collaboration with other parties. 

Appointment will initially be made on contract basis for three years commencing as soon as possible, renewable 
subject to mutual agreement. 


Salary and Fringe Benefits 

Salary will be highly competitive, commensurate with qualifications and experience. The University offers a 
comprehensive fringe benefit package, including medical care, plus a contract-end gratuity for appointments of 
two years or longer, and housing benefits for eligible appointees. Further information about the University and 
the general terms of service for appointments is available at https://www2.per.cuhk.edu.hk/. The terms mentioned 
herein are for reference only and are subject to revision by the University. 


Application Procedure 

Application forms are obtainable (a) at https://www2.per.cuhk.edu.hk/, or (b) in person/by mail with a stamped, 
self-addressed envelope from the Personnel Office, The Chinese University of Hong Kong, Shatin, Hong Kong. 
Please send the completed application form and/or full curriculum vitae, together with copies of qualification 
documents, a publication list and/or abstracts of selected published papers, and names, addresses and fax numbers/ 
e-mail addresses of three referees to whom the applicants’ consent has been given for their providing references 
(unless otherwise specified), to the Personnel Office by post or by fax to (852) 3942 0947 by the closing date. 
Please quote the reference number and mark ‘Application — Confidential’ on cover. The Personal Information 
Collection Statement is obtainable at https://www2.per.cuhk.edu.hk/. 


Advance your career 
with expert advice from 
Science Careers. 


Download Free Career Advice Booklets! 
ScienceCareers.org/booklets 


Featured Topics: 

• Networking
• Industry or Academia
• Job Searching
• Non-Bench Careers
• And More


Science Careers 


FROM THE JOURNAL SCIENCE AAAS


POSITIONS OPEN 


UNIVERSITY OF SOUTH CAROLINA
SCHOOL OF MEDICINE

TENURE-TRACK ASSISTANT PROFESSOR 


The Department of Pathology, Microbiology, and
Immunology at the University of South Carolina’s
School of Medicine invites applications for a tenure-track
ASSISTANT PROFESSOR position with expertise
in Immunology or Microbiology. Special consideration
will be given to candidates with expertise in Microbiome
or other Omics research. The successful
candidate is expected to develop a strong extramurally
funded research program complementing current faculty
research interests (http://pmi.med.sc.edu/), and
participate in teaching. The department is currently
ranked in the top 15 among Pathology departments in
the nation in NIH funding, and hosts several NIH-funded
Research Centers including the Center for Complementary
and Alternative Medicine, the Center of Biomedical
Research Excellence on Dietary Supplements and Inflammation,
and the IDeA Network of Biomedical Research
Excellence. The department and Centers provide
excellent mentoring opportunities to junior faculty. Candidates
must have a PhD or equivalent, and at least 3 years
of postdoctoral research experience. Competitive salary
and startup funds are available. Please submit curriculum
vitae and a statement of research and teaching interests
with names of 3 references to Dr. Mitzi Nagarkatti,
Chair, Department of Pathology, Microbiology, and
Immunology, University of South Carolina School
of Medicine, Columbia, SC 29208 or e-mail:
pmi.immunology@uscmed.sc.edu. The search will start
immediately and will continue until the position is filled.
University of South Carolina Columbia is an Equal Opportunity
Affirmative Action Employer and encourages applications from women
and minorities and is responsive to the needs of dual career couples.


California State University, Long Beach 
TENURED PROFESSOR 
Health, Biomedical, or Behavioral Sciences, 
or Engineering 

California State University, Long Beach (CSULB)
invites applicants with a record of accomplishments to
qualify for appointment at the rank of Full Professor
with tenure starting August 17, 2016. The successful
candidate will serve as a 50% Principal Investigator (PI)
with two other PIs on the National Institutes of Health
funded CSULB BUILD (Building Infrastructure Leading
to Diversity) award (website: http://www.csulb.edu/build).
The other 50% of academic-year time is for teaching and
service in the discipline area. For further information, see
the position description at website:
www.csulb.edu/divisions/aa/personnel/jobs/posting/2362/index.html.
Screening of applications to begin April 18,
2016. CSULB is an Equal Opportunity Employer.


Post your jobs Fast and Easy 


ScienceCareers 


employers.sciencecareers.org 


CUNY ADVANCED SCIENCE RESEARCH CENTER


The CUNY Advanced Science Research Center (ASRC) seeks a dynamic and innovative scientist with demonstrated leadership and 
research accomplishments in Photonics to serve as: 


CUNY ASRC Director of Photonics & Professor 


The ASRC is a hub of scientific exploration in Upper Manhattan, the centerpiece of an integrated network that brings together 
researchers from a number of science’s most dynamic disciplines — Nanoscience, Photonics, Structural Biology, Neuroscience, and 
Environmental Sciences — in a highly collaborative research environment. Offering state-of-the-art facilities and instrumentation to 
CUNY scientists of all levels — faculty, postdoctoral fellows, and graduate and undergraduate students — and other researchers from 
the New York City scientific community, the center positions the University at the vanguard of scientific research and education. 




The successful candidate will be expected to develop a center of excellence in photonics research to complement and be integrated 
with existing activities within the ASRC, CUNY, and the NYC metro area. The candidate is expected to: lead an internationally 
leading research program; recruit new faculty; build consortia; contribute to teaching at one of the senior CUNY colleges; oversee the 
Photonics Initiative; and ensure compliance with federal research guidelines and University policies.

Applicants must be accomplished researchers with international stature in a photonics area with an outstanding record of scholarly
activities and possess appropriate credentials for a senior faculty appointment at one of the CUNY colleges. The Director will develop 
the Photonics Initiative into an integral component of the ASRC’s scientific portfolio. Preference will be given to those whose 
experimental research focus areas include but are not limited to one or more of the following: biophotonics, nanophotonics, terahertz 
technology, ultrafast spectroscopy, single molecule spectroscopy, or plasmonics. 


For more information about the CUNY ASRC Nanofabrication Facility, please visit nanofab.asrc.cuny.edu 
For general information about the CUNY ASRC, please visit asrc.cuny.edu 


We are committed to enhancing our diverse academic community by actively encouraging people with disabilities, minorities, veterans, and
women to apply. We take pride in our pluralistic community and continue to seek excellence through diversity and inclusion. EO/AA Employer.


To apply, visit asrc.cuny.edu/jobs

Universitätsklinikum Würzburg


The Interdisciplinary Center for Clinical Research (IZKF) organizes the internal research funding of the 
Medical Faculty of the University of Würzburg. Its major goal is to strengthen clinical research on the basis of
interdisciplinary cooperation between clinical and basic research groups. To carry out its mission, the IZKF 
supports cooperative research grants, promotes training and advancement of young researchers in medicine and 
improves the scientific infrastructure. 


The IZKF intends to establish a new
Junior Research Group

Tissue Regeneration in Musculoskeletal Diseases

to be affiliated with the Musculoskeletal Center Würzburg (MCW). We are looking for a researcher with
outstanding postdoctoral experience and international recognition in the general fields of tissue regeneration.
This research group should focus on molecular dissection and reconstitution of early tissue regeneration
including stem cell technologies, physical and biochemical cues and materials / scaffolds, in order to support
concept strategies to establish e.g. SFB initiatives in these areas of research.

In addition to the position of the Group Leader the grant of the IZKF will provide funding for up to 5 years for:
• a postdoctoral scientist
• a PhD student
• a technician
• Consumables and start-up funding
Laboratory space and basic equipment will be provided at the Department of Orthopedics König Ludwig Haus.

Further information on our homepage at http://www.izkf.ukw.de/

Interested individuals should send a one-page description of their research interests and future directions, CV
and publication list, and the names of three academic referees by 14.04.2016. We may request that short-listed
candidates provide a more detailed research proposal at a later date. These candidates also will be invited to
present their research in Würzburg.
Preference will be given to people with disabilities in the case of otherwise equal aptitude. The University aims to
increase the proportion of female researchers; therefore applications from qualified women are particularly welcome.

Applications should be sent via email in one pdf-file to the IZKF office to:
hawks_m@ukw.de
Informal inquiries can be made to Prof. Heike Walles — heike.walles@uni-wuerzburg.de

AAAS is here — helping scientists achieve [illegible]
of the information, advice, and opportunities
they need to take the next step in
their careers.

A complete career resource, free to the
public, Science Careers offers hundreds
of career development articles, webinars
and downloadable booklets filled with
practical advice, a community forum
providing answers to career questions,
and [illegible] of job [illegible] academia,
government, and industry. As a AAAS
member, your dues help AAAS make this
service available to the scientific community.
If you're not a member, join us. Together
we can make a difference.
To learn more, visit
aaas.org/plusyou/sciencecareers


APPLICATION DEADLINE: 14.04.2016

Issue date: April 8


Ads accepted until April 1 if space allows 


For recruitment in science, there’s only one Science. 


Looking to hire a cancer researcher? Reach them through the 
pages of Science. Our upcoming cancer feature explores how 
major institutions are planning to prepare for the challenges 
involved in precision medicine. This hot research area is sure 
to draw the readers you need to reach. 


subscribers in print 
every week 


• Read and respected by 400,000 readers around the globe

Your ad dollars support AAAS and its programs, which
strengthens the global scientific community.

• Relevant ads lead off the career section with a special
cancer research banner


are Ph.D.s 
• Bonus distributions:


American Association for Cancer Research, 
April 16-20, New Orleans, LA

Stony Brook Medicine
Chief, Division of Endocrinology


Stony Brook University School of Medicine at the State University of New York invites applications for the position of Division Chief of Endocrinology from 
experienced Endocrinology faculty. 


Required Qualifications: Physician candidates with U.S. Board Certification in Internal Medicine and either an MD or combined MD/Ph.D. degrees. High level 
of clinical competence. Collaborative leadership style. Strong administrative skills. Strong scholarly background and a vision to expand the division's existing 
clinical, research and training endeavors. Strong interpersonal and communication skills. Experience in budgetary/fiscal/personnel management as well as 
proven mentorship ability. 

Preferred Qualifications: Demonstrated publication and/or funding track record in clinical, translational, and/or basic research. Commitment to advancing 
scientific knowledge of the field of Endocrinology, as well as demonstrate a strong commitment to clinical, educational and academic excellence. Experience 
with development, management and expansion of a robust clinical program. 


Responsibilities & Requirements: The Division Chief of Endocrinology is a full time position reporting directly to the Chair of the Department of Medicine. 
Responsibilities include overseeing the clinical care delivered by faculty within the division, promoting investigator-initiated scholarship and engaging in the 
training and professional development of junior faculty, fellows, residents and students. Other responsibilities include strengthening collaborative relationships 
with related departments. A critically important role for the Division Chief will be to engage faculty physicians in delivering high-quality care, improved service, 
and lower costs for our patients. Under strong leadership, the Division is anticipated to evolve into a major national and regional force in Endocrinology. 


To qualify for an appointment as Associate Professor or Professor, the candidate must meet the criteria established by the School of Medicine (School of 
Medicine's Criteria for Appointment, Promotion and Tenure). 
Anticipated Start Date: ASAP. 
Salary: Commensurate with experience. 
Application Procedure: Those interested in this position should submit a State employment application, cover letter and resume/CV to: 
Vincent W. Yang, MD, Ph.D. 
c/o Susan Legrady 
Department of Medicine 
Health Sciences Center, Level 16, Room 020 
Stony Brook University 
Stony Brook, NY 11794-8160 
Fax: (631) 444-3144 


For a full position description, or application procedures, visit: www.stonybrook.edu/jobs (Ref. # F-9618-16-03). 
Equal Opportunity Employer, females, minorities, disabled, veterans 


FUNDING OPPORTUNITIES — U.S. Department of Defense 
Defense Health Program 


Peer Reviewed Medical Research Program 


The Peer Reviewed Medical Research Program (PRMRP) funds exceptional research with the goal to improve the health and well-being of all military Service Members, 
Veterans, and their beneficiaries. The PRMRP received $278.7 million in fiscal year 2016 (FY16) and seeks grant applications in the following topic areas: 


Acute Lung Injury 
Antimicrobial Resistance 
Chronic Migraine and Post-Traumatic Headache 
Congenital Heart Disease 
Constrictive Bronchiolitis 
Diabetes 
Dystonia 
Emerging Infectious Diseases 
Focal Segmental Glomerulosclerosis 
Fragile X Syndrome 
Hepatitis B 
Hereditary Angioedema 
Hydrocephalus 
Inflammatory Bowel Disease 
Influenza 
Integrative Medicine 
Interstitial Cystitis 
Lupus 
Malaria 
Metals Toxicology 
Mitochondrial Disease 
Nanomaterials for Bone Regeneration 
Nonopioid Pain Management 
Pancreatitis 
Pathogen-Inactivated Dried Plasma 
Polycystic Kidney Disease 
Post-Traumatic Osteoarthritis 
Psychotropic Medications 
Pulmonary Fibrosis 
Respiratory Health 
Rett Syndrome 
Rheumatoid Arthritis 
Scleroderma 
Sleep Disorders 
Tinnitus 
Tuberculosis 
Vaccine Development for Infectious Disease 
Vascular Malformations 
Women’s Heart Disease 


Descriptions of the FY16 PRMRP Program Announcements and General Application Instructions are anticipated to be posted on Grants.gov by mid-March 2016: 

* Clinical Trial Award 
* Discovery Award 
* Focused Program Award 
* Investigator-Initiated Research Award 
* Technology/Therapeutic Development Award 


All applications must conform to the Program Announcements and General Application Instructions that will be available for electronic downloading from the Grants.gov website (all viewable 
under CFDA number 12.420). Execution management support will be provided by the Congressionally Directed Medical Research Programs. 


For more information, please visit: http://cdmrp.army.mil/funding/prmrp.shtml 
http://cdmrp.army.mil 
* Read and respected by 400,000 readers around the globe 


Your ad dollars support AAAS and its programs, which 
strengthens the global scientific community. 


Relevant ads lead off the career section with special 
immunology banner 


* Bonus distribution to Immunology 2016 (AAI), 
May 13-17, Seattle, WA. 


Dedicated landing page for jobs in immunology 


job board. 
Job Vacancies in China’s Universities 


ScienceCareers 
FROM THE JOURNAL Science, BY AAAS 


China’s Rapid Development: More Opportunities 


Nanjing Normal University(Nanjing, China) 


Distinguished professors from around the world are sought in Jiangsu, China, across 
multiple disciplines; competitive conditions are provided. 


Southeast University (Nanjing, China) 


Southeast University invites applications from outstanding scientists for 
tenure-track positions, and for positions as chair professors and visiting 
professors. 


Nanjing University of Aeronautics and Astronautics (Nanjing, China) 


NUAA seeks talent from home and abroad in Natural Sciences, 
Humanities, Social Sciences, Management, Aeronautics, Astronautics and 
Civil Aviation. 


Sichuan University (Chengdu, China) 


Open faculty and postdoc positions at the Industrial Internet Institute in 
physics, mathematics, automation control, computer science, artificial 
intelligence, information technology, mechanical engineering. 


The Institute for Advanced Study (IAS) of Shenzhen University 
(Shenzhen, China) 


Professor positions available at Shenzhen University in Risk Management, 
Statistics; Material Physics, Soft-Matter Physics & Physical Biology; Material 
Chemistry; Marine Biology, Bio-Medicine. 


Wuhan University (Wuhan, China) 


Job vacancies in mechanical engineering, information and communication 
engineering, power systems and automation, environmental science and 
engineering, new energy materials, transportation engineering and so on. 


Looking for more positions? Please send your 
CV to acabridge@163.com 

Hotline: +86-10-62603373, +86 15300215485 
zhaojia@eol.cn, +86-10-62603373 




Principal investigator (PI) recruitment - Orthopedic research - Guangzhou, 
the First Affiliated Hospital of Jinan University 


Job description 


The Department of Orthopedics of the First Affiliated Hospital at Jinan University invites applications for a full-time 
Orthopedic research PI position in the area of Orthopedic diseases, including: I) epigenetic regulation mechanisms 
of knee osteoarthritis and diabetic osteoarthritis; II) the discovery of circulating biomarkers (peptide, protein, DNA, 
exosome) for the early diagnosis of human osteoarthritis and arthritis-related diseases; III) development of novel 
approaches for the early detection of osteoarthritis and infection after total knee (hip) arthroplasty. Applicants 
should have at least a doctoral degree (M.D. and/or Ph.D.) with senior-level research experience in a research lab or 
at least 2-4 years of productive postdoctoral training experience. Applicants must demonstrate a research focus 
on the elucidation of the cellular and molecular oafthritis, diabetic osteoarthritis and 
arthritis-related diseases. Candidates will be consi a 


ing in cells and molecular 
a papers discovery and 
arthritis study. Experience in type-2-diabe , Applicants should 


send a curriculum vita, a summary of pla ence to: Director 


of Orthopedic of the First Affiliated Hospi 


Salary Range 


Refer to Type II recruitment of Jinan 
Young Scientists) 


stinguished 


Applicant Special Instructio 


Materials needed to send: CV, researchp ~~ 

Contact Name: Zhen-Gang Zha, M.D. and 

Phone: 86-20-38688617 FAX: 86-20-386880 gg @vip.163.com 

Mailing Address: Institute of Orthopedic Disea D ment of Orthope iF 


Jinan University, Guangzhou, 510630, China 


ated Hospital, 


Deadline for Application: April 30, 2016 


Tianjin University 

Faculty Recruitment of Tianjin University & Invitation of PEIYANG Forum for 
Young Scholars, 27th-30th Apr 2016, Tianjin 


Positions 

Tianjin University (www.tju.edu.cn) invites outstanding applicants for full-time positions at the full professor level, and seeks 
candidates in the following areas. 

Science or Engineering: 

Engineering, natural science, life science, pharmacy, information technology, relevant emerging inter-disciplines, etc. 

Other Academic Fields: 

Architecture, economics, business, management, social sciences, relevant emerging inter-disciplines, etc. 

Applicants whose research involves multidisciplinary and non-traditional approaches are especially welcome. 


Qualifications 

Applicants shall be no more than 40 years old and hold a PhD degree with at least 2 years of work experience abroad. 
Applicants are expected to be promising scholars with outstanding academic records and international 
recognition in their fields. 


Salary and Benefits 

An appointee who successfully obtains National Youth “1000-Plan” status by applying through TJU shall qualify for the following 
benefits: 

An annual pre-tax salary ranging from 400k to 600k, a relocation allowance of 1.5 million, the academic title of professor, and 
arrangements for the spouse's employment and children's education. 

Salary standards in other academic fields shall be formulated with reference to those in science or engineering, and can also be 
adjusted according to the candidate's identity, credentials, and academic characteristics. 


Work Supports 
Start-up funds, working space, and a postgraduate enrollment quota will be offered to the 
appointee. 


Forum Arrangements 

To help applicants get to know TJU, the PEIYANG Forum for Young Scholars in Science and Engineering will be 
held from April 27 to 30, 2016 at TJU. Invitees' travel and accommodation will be covered by TJU, with reimbursement 
of no more than 10,000 RMB per applicant. More details are available at http://hr.tju.edu.cn/zpxx/js/. 

Arrangements for the PEIYANG Forum for Young Scholars in other fields will be announced later. 


Deadline for Applications 

Please submit a complete application package consisting of the following documents to oplan@tju.edu.cn. The application deadline is 
15th April 2016. 

(1) A detailed application form, which can be downloaded from http://hr.tju.edu.cn/zpxx/js/; 

(2) A detailed curriculum vitae; 

(3) Publication list and five full-text representative publications. 


Contacts 
Contact Persons: Ms. ZHANG Yinlu, Dr. LIU Na 
Human Resource Department, 
Tianjin University, China 
E-mail: oplan@tju.edu.cn 
Telephone: (+) 86-022-27402079 
Fax: (+) 86-022-27404177 
Address: B316/Building XINGSUN, 135 YAGUAN Road, JINNAN District, 
Tianjin, 300350 
Ads accepted until May 20 
if space allows 


Science Careers 


For recruitment in science, there’s only one Science. 


Deliver your message to a global audience of targeted, qualified scientists. 


What makes Science the best choice? 

* Read and respected by 400,000 readers around the globe 
* 75% of readers read Science more often than any other journal 
* Your ad dollars support AAAS and its programs, which 
strengthens the global scientific community. 

129,574 subscribers in print every week 


Why choose this Biotechnology Focus for your advertisement? 

* Relevant ads lead off the career section with special 
biotechnology banner 
* Bonus distributions: 
BIO, June 6-9, San Francisco, CA 
BIO Career Fair, June 9, San Francisco, CA. 

48,366 unique active job seekers searching for biotechnology positions in 2015 
27,111 applications submitted for biotechnology positions in 2015 


Expand your exposure. 
Post your print ad online to benefit from: 

* Link on the job board homepage directly to biotechnology jobs 
* Dedicated landing page for jobs in biotechnology 
DANA-FARBER CANCER INSTITUTE 


Laboratory Based Assistant/Associate 
Professor in Ovarian Cancer Research 


The Department of Medical Oncology at the Dana-Farber Cancer 
Institute, the Gynecologic Oncology Program of the Susan F. Smith 
Center for Women’s Cancers, and the Brigham and Women’s Hospital 
invite applications for a full-time appointment at the Assistant or 
Associate Professor level. This individual will develop an independent 
laboratory-based translational research program focused on ovarian 
cancer. The research program will interface directly with the 
translational and clinical research efforts within the Gynecologic 
Oncology program at DFCI as well as other laboratories at DFCI. 
Candidates with interests in the genomic basis of ovarian cancer, 
ovarian cancer biology, and/or immunology as well as research engaged 
in pre-clinical development of new therapeutic approaches are 
especially encouraged to apply. The candidate must have an MD and/or 
PhD and a proven track record of outstanding laboratory research. 


The candidate will work principally at the Dana-Farber Cancer Institute 
and the Brigham and Women’s Hospital. Appointment as Assistant 
or Associate Professor at the Harvard Medical School will be 
commensurate with experience, training and achievements. Salary 
and benefits will be competitive with other institutions. Dana-Farber 
Cancer Institute is an NCI-designated Comprehensive Cancer Center 
and is an equal opportunity employer. 


Interested candidates must submit a curriculum vitae, a 
research plan and 3 letters of reference to: Ursula Matulonis, 
M.D., Director, Gynecologic Oncology Program, Dana-Farber 
Cancer Institute, 450 Brookline Avenue, Boston, MA 02215. 
Please send submissions via email to: umatulonis@partners.org 


HARVARD MEDICAL SCHOOL 
BRIGHAM AND WOMEN’S HOSPITAL 


We are an equal opportunity employer and all qualified applicants will receive 
consideration for employment without regard to race, color, religion, sex, sexual 
orientation, gender identity, national origin, disability status, protected veteran status, 
or any other characteristic protected by law. 


Physician-Scientists 


The Penn State Milton S. Hershey Medical Center and Penn State 
College of Medicine in Hershey, PA, are recruiting five tenure-track 
physician-scientists at all faculty levels in any clinical or research 
discipline. These positions include 75% protected time (guaranteed 
during the four-year start-up period) for basic or translational 
research, while maintaining some clinical activity. 


Candidates should possess M.D. or M.D./Ph.D. degrees, be licensed 
to practice medicine in the US, and have completed residency, 
fellowship and/or postdoctoral training. Successful candidates will 
be expected to maintain a robust externally-funded research 
program and participate in graduate and postdoctoral training. 


Penn State College of Medicine is a collaborative research 
environment in biomedical and health sciences research with several 
interdisciplinary graduate degree programs and a physician-scientist 
pipeline that includes a required research component for all medical 
students and a vigorous M.D./Ph.D. program. The Penn State Milton 
S. Hershey Medical Center is a 551-bed, tertiary care facility that 
serves central Pennsylvania. Review of applications will begin upon 
receipt. 


Interested individuals should submit their CV, a letter of 
interest, and a description of their research program to 
physicianscirecruit@hmce.psu.edu by June 1, 2016. 


The Penn State Milton S. Hershey Medical Center is 
committed to affirmative action, equal opportunity and 
the diversity of its workforce. Equal Opportunity 
Employer - Minorities/Women/Protected 
Veterans/Disabled. 


PennState Health Milton S. Hershey Medical Center 
PennState College of Medicine 
UNIVERSITÄT BERN 


The Medical Faculty of the University of Bern is accepting 
applications for a faculty position at the level of 


Full Professor (Ordinariat) for 
Biomedical Research associated 
with the Directorate of the 
Department of Biomedical Research 
starting January 1, 2017. 


The Department of Biomedical Research (DBMR) is an institution 
affiliated with the Medical Faculty of the University of Bern with 
the task of coordinating and supporting biomedical research and 
the operation of core facilities. 


Founded in 1994 as the Department of Clinical Research (DCR), 
the DBMR today supports all biomedical research 
conducted by the medical clinics of the Bern University Hospital 
(Inselspital). The DBMR provides an umbrella for all research 
groups with the aim of promoting the exchange of important 
technologies and encouraging scientific cooperation in the field of 
clinical, basic and translational research. Currently, 47 research 
groups are located within the DBMR, which also encompasses a 
number of operational core facilities including flow cytometry, 
proteomics, genomics and live cell imaging. The Director of the DBMR is 
responsible for a total of 50 employees. 


As Director, the applicant should be a highly qualified scientist 
and be internationally recognized as a pioneer in the biomedical 
research field. He/she will be responsible for the organization of 
research infrastructure affecting all clinics of the Bern University 
Hospital (Inselspital) and direct the development of core facilities 
within the DBMR. The applicant will work with clinical research- 
ers, provide scientific advice to research groups and promote the 
continued development of junior scientists. The candidate will 
support the scientific development of the Medical Faculty by 
conducting his/her own research. A collaboration with the NCCR 
RNA and Disease Program is desired and preference will be given 
to candidates with a corresponding research interest. 


We seek applicants with exemplary leadership, management, 
communication, and networking skills. Candidates must 
demonstrate the ability to integrate expertise across disciplines 
and will be called upon to motivate different sectors and 
professional groups within the DBMR. 


A «Habilitation» or equivalent academic track record is required. 


The University of Bern is an equal opportunity employer. 
Women are particularly encouraged to apply, in accordance with the 
DBMR’s effort to increase the number of women in leadership 
positions within the teaching staff of the medical faculty. 


Further information can be obtained from the President of the 
Successor Commission, Prof. Aurel Perren, Director of the 
Institute of Pathology (E-mail: aurel.perren@pathology.unibe.ch). 


Applications must be submitted electronically by April 22, 2016 
to the Office of the Dean (E-mail: bewerbungen@meddek.unibe.ch). 
Information on the required documents can be found 
at http://www.medizin.unibe.ch/dienstleistungen/fakultaere 
rechtssammlung/akademische_laufbahn/ausgeschriebene_ 
professuren/index_ger.html. 


Office of the Dean, Medical Faculty, University of Bern, 
Murtenstrasse 11, CH-3008 Bern 
WORKING LIFE 


By Rosalind A. Segal 




Lessons from a bridge generation 


I am part of a bridge generation: the middle of three generations of women working in the sciences. 
My mother, a cognitive scientist and tenured professor, received her Ph.D. in 1961. I earned my 
Ph.D. in 1985 and am now a tenured professor of neurobiology. My daughter is currently pursuing 
a Ph.D. in biology and looks forward to a career in scientific research. Looking across the three 
generations in my family, I see huge improvements in the experience of women in science, but 
obstacles still remain for my daughter’s generation. 


My mother dealt with explicit sex- 
ism throughout her career. As an 
undergraduate, her advisers sug- 
gested that she embark on a mas- 
ter’s program that would be more 
flexible for beginning a family, 
rather than the Ph.D. program she 
had in mind. When she finished 
her Ph.D., her advisers assumed it 
marked the end of her scientific 
career and that she would stay 
home to care for her growing fam- 
ily. When she applied for her first 
grant, she deliberately eliminated 
all indications of her gender so that 
the reviewers would assume that 
her first name, Sydney, was a man’s. 

In spite of the obstacles, my 
mother felt her career demon- 
strated that women could succeed 
in academic science. Because of 
her influence, I thought sexism was 
no longer an impediment to success and so could be ig- 
nored, and I was confident that I would not need to choose 
between children and career. My mentors also provided 
strong encouragement and support, which strengthened 
my belief that sexism would not be an obstacle for me. 

In retrospect, though, I recognize that my peers and I 
dealt with attitudes that would now be considered sexist, 
if not illegal. When a professor propositioned us, we as- 
sumed this was acceptable behavior, and all we could do 
was change classes and alert our friends. When our evalu- 
ations and reference letters stated we were “quiet” or we 
“asked for help frequently,” we assumed this indicated our 
deficiencies. Only later did we realize that such comments 
are disproportionately used for women, often to the det- 
riment of their careers. These seemingly minor impedi- 
ments discouraged many of the researchers in my cohort 
from continuing in science. 

As my career progressed, I gradually became aware of 
the attitudes and policies pervading science that are 
discouraging to women. When I started as a tenure-track 
assistant professor with two young 
children, I chose to reduce my sal- 
ary so that I could concentrate on 
developing my research program 
without conflicting administra- 
tive duties. Instead, my depart- 
ment chair expected me to do as 
much administrative work as my 
male peers, who had negotiated 
higher salaries and better titles. 
After being awarded a grant for 
early-stage investigators, I was 
dismayed to find that I, the lone 
woman recipient, was the only one 
not asked to speak at the founda- 
tion meeting. Each episode may 
be minor, but over time, such in- 
cidents accumulate and hinder 
career advancement. 

I believe there is reason to 
be optimistic that scientists in 
my daughter’s generation are less 
likely to encounter such situations. At a recent meet- 
ing, two younger women stated that they had never ex- 
perienced any sexism. I am thrilled to hear this—and 
I hope it truly reflects an improved environment for 
women. I worry, however, that it is the same attitude I 
had early in my career, when gender issues seemed too 
minor for complaint. 

My daughter knows that sexism in science exists and 
that bias is often unconscious. This awareness prepares 
her to confront sexist behaviors politely but firmly, confi- 
dent that the behavior will change and will not impede her 
success. My hope for her and her generation of scientists 
is that continued awareness of bias will help push the sci- 
entific community to initiate reforms that prevent “minor” 
problems from accumulating and creating disparities. 


Rosalind A. Segal is a neurobiology professor at Harvard 
Medical School and co-chair of cancer biology at the Dana- 
Farber Cancer Institute in Boston. Send your story to 
SciCareerEditor@aaas.org. 